Why we have to be modest about football analytics
Football is not baseball. Football is not basketball. Football is not American football. And football certainly isn’t cricket. Football is unlike any other team sport because, and I hope I don’t offend fans of these other sports by saying this, football is the most complex of sports.
What do I mean by complex? That definition is what we are going to look at in this article, because until we recognize the true complexity of football (and how this makes it different from other sports), we can’t start to find ways to understand the game.
The complexity is important to get straight, because irrespective of whether you are the technical director of a big club, coaching at the grassroots level or a fan who knows her club inside out, you will have lots of companies and specialists telling you that they can solve football. That their product — which measures expected goals or passing networks or fitness data or something else — can give you the insight you need.
And you will often feel uncomfortable, because you will wonder whether it really can be as simple as they say. When you think of football, you feel it deep inside your body. You can feel the movements. You can visualize what good football looks like, what it feels like to see or hit the perfect pass. Football is intuitive for you, deep inside the way you think and move. You can see a story in the game. It is this feeling that is football.
And surely this feeling can’t be captured by a statistical measure, by an app, by a visualization, by Key Performance Indicators or by a mathematical equation?
There is a modern notion that your feelings are wrong and the numbers are right. During my time working in football, I have often heard pitches and presentations which start with quotes from Daniel Kahneman’s book Thinking, Fast and Slow. Kahneman’s research is outstanding (he and others have identified and documented human biases), but the way his research is presented by others often suggests that it is statistics and logic (type 2 thinking) which you should trust (in particular, the statistics used by the person citing Kahneman…) and your intuition and ‘gut feeling’ (type 1 thinking) which you should see as biased or incomplete.
In fact, this is not really what Kahneman claims, nor is it anywhere near the truth. In reality, sometimes type 2 thinking works best, and other times type 1 thinking is better. Take, for example, what has come to be known as the ‘hot hand fallacy’ in basketball. The ‘hot hand’ is the idea that a player who has just scored is more likely to score again. Initial research suggested that this intuition, shared by fans, coaches and players, was a fallacy: that there was little or no correlation between success on one shot and success on the next. But research by the psychologist Markus Raab suggests a subtler picture: larger studies do show a hot hand in sports, and defences might adjust for the hot hand of their opponent, thus making it harder for the ‘hot’ player to score. The hot hand fallacy is itself a fallacy! A fallacy arising from too much faith in analytics and type 2 thinking.
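To make the statistical question concrete, here is a minimal sketch of the kind of comparison at issue: how often a player scores right after a make versus right after a miss. The function name and the shot sequence are hypothetical, invented purely for illustration, and this is not the method used in any of the studies mentioned above.

```python
def hot_hand_gap(shots):
    """Return P(make | previous make) - P(make | previous miss).

    `shots` is a sequence of 1 (make) and 0 (miss) in the order taken.
    A positive gap is what a 'hot hand' would look like in this naive view.
    """
    # Pair each shot with the one taken immediately before it.
    pairs = list(zip(shots, shots[1:]))
    after_make = [cur for prev, cur in pairs if prev == 1]
    after_miss = [cur for prev, cur in pairs if prev == 0]
    if not after_make or not after_miss:
        raise ValueError("need at least one shot after a make and one after a miss")
    return sum(after_make) / len(after_make) - sum(after_miss) / len(after_miss)


# Hypothetical shot record for one player (made-up data).
example_shots = [1, 1, 0, 1, 1, 1, 0, 0, 1, 0]
gap = hot_hand_gap(example_shots)
print(f"Gap in scoring rate after a make vs after a miss: {gap:+.3f}")
```

A single number like this is exactly the sort of measure to be modest about. It ignores how hard each shot was, so if defences tighten up on a ‘hot’ player, the gap can shrink or turn negative even when the player really is in form.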
The work of Raab, and his colleague Gerd Gigerenzer, shows that it is in complex situations that our heuristic understanding helps us most. In these cases, it might be your feelings and intuitions that are the best model of a situation. For example, the ‘take-the-first’ heuristic, which suggests players should shoot or pass based on the first option that occurs to them, is widely used by the best and most experienced professional players in handball and basketball. It is intuitive, but it works.
And this is why we need to define complexity, because there is a link between the best way to approach the study of a sport and the complexity of that sport. The more complex a sport, the less we should trust methods based purely on statistics. That is not to say we should just go with our intuition, either. But it does mean we should start by knowing what we mean when we say football is complex.
When characterising complexity we usually talk about several different features. One is that a complex system is non-linear. Teams in football are more than the sum of their parts: pressing, for example, only works if everyone in the team presses. Another is that complex systems are open: they are not closed off from the external world. For example, the crowds, now returning to the matches, form an ever-changing external input that influences how the players feel and perform. Think of some of the great turnarounds in Champions League knock-out stages. The players are not robots; they are people whose feelings are intertwined with their performances. A further feature is that complex systems are historical. Barcelona has a style and a way of playing which has evolved over decades and can’t be summed up purely in terms of how a particular pass is made now.
Most importantly, complex systems are dynamic. Natalia Balague, professor at the University of Barcelona, and pioneer of the idea of complex systems in sport, writes that training exercises for team sports should not ‘inform the athlete about a theoretically ideal motor output, but create tasks where skill can solve the constantly changing situations’. She quotes a leading Spanish coach as saying of his team, ‘When I see them moving like a flock of birds I know they are playing well.’ Complex systems, from bird flocks to football teams, never stay still, are always changing form, but also appear to have an underlying structure.
Non-linear, open, historical and dynamic… Compare it to other sports on these terms… Football is the most complex of sports.
The cognitive scientist Abeba Birhane writes that modern statistical and machine learning tools try to “impose order, equilibrium, and stability to the active, fluid, messy, and unpredictable nature of human behaviour.” The argument that Birhane puts forward (that this attempt to impose order ultimately fails, and that the consequences can often be negative for the people who are modelled) follows from a philosophical tradition which says that there is never one true way to understand a complex system. It is this same insight, which many of us feel but have difficulty expressing, that makes us feel uncomfortable when a company or individual claims to have a complete solution to the game we love. Football is just too complex to understand with a single model.
From the arguments above, we have established two things: (1) we should (partly) trust our intuition when studying a complex sport like football and (2) football is so complex we will never truly understand it. How then can we use mathematical models and analytics to improve our understanding of the game?
Birhane recently wrote Twitter threads reviewing the work of two leading complex systems philosophers, Alicia Juarrero and Paul Cilliers. What these complexity thinkers emphasize is a modesty in how we use mathematical models. Following their advice, we need to start by admitting that there is no single way of seeing football (or any other complex system) that will solve all of our problems. Rather, there are different ways of seeing the game: different models, different concepts, and different ideas. We shouldn’t argue that everything else is wrong, but rather find different ways of being somewhat right, some of the time.
This is exactly what I am going to do in this series of articles. I am going to find ways in which we are somewhat right, some of the time, about the complex game of football. This is going to take us through models of team performance, player recruitment, off-the-ball movement, formations, free kicks and penalty-taking. We are going to look at the impact Moneyball’s Billy Beane is having on English Championship team Barnsley, the latest developments in data collection, find out when football is chaotic (and when it isn’t), whether or not Google can ‘solve football’ like it solved Go and Chess, what the job of a football data scientist looks like at clubs (and how it should look) and many other questions.
This won’t be just one view, it will be many views of the game: the complex game. The most complex game of them all.