Explaining Expected Threat
One of the key questions for everyone interested in football, from coaches and scouts to fans, is how to assess the quality of a player using data. If they score a lot of goals they must be good and, more recently, we have understood that finding good scoring opportunities (having high xG) is also good. But what about all those passes, dribbles, blocks and interceptions? How do we value them?
It is with this in mind that The Athletic have started using expected threat when talking about player and team performance. The idea is to assign a value to every point on the football field based on the probability that having the ball at that point will lead to a goal. One example of these probability maps is shown below.
To evaluate an action, we look at how it changes the probability of scoring. This change in the probability of scoring is the expected threat (xT) of the action. If a player makes a pass which moves the ball from a place where their team is unlikely to score to a place where they are more likely to score, then they have increased the xT in favour of their team. In general, the nearer you get the ball to the goal, the more likely your team is to score (although, if you look carefully, passes back to the goalkeeper are also valuable).
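To make the "change in probability" idea concrete, here is a minimal sketch in Python. The grid values, the 105 x 68 metre pitch and the function names are all invented for illustration; they are not taken from any real model.

```python
# Hypothetical 3 x 4 grid of scoring probabilities. Rows run across the
# pitch, columns run from our own goal (left) to the opposition goal (right).
# Every number here is invented for illustration only.
XT_GRID = [
    [0.01, 0.02, 0.05, 0.10],
    [0.01, 0.03, 0.08, 0.30],
    [0.01, 0.02, 0.05, 0.10],
]

PITCH_LENGTH = 105.0  # metres, assumed
PITCH_WIDTH = 68.0    # metres, assumed


def cell(x, y):
    """Map pitch coordinates in metres to a (row, col) cell of the grid."""
    n_rows, n_cols = len(XT_GRID), len(XT_GRID[0])
    col = min(int(x / PITCH_LENGTH * n_cols), n_cols - 1)
    row = min(int(y / PITCH_WIDTH * n_rows), n_rows - 1)
    return row, col


def action_xt(start, end):
    """xT of an action: grid value where it ends minus where it starts."""
    r0, c0 = cell(*start)
    r1, c1 = cell(*end)
    return XT_GRID[r1][c1] - XT_GRID[r0][c0]


# The same pass played forwards and backwards through the middle.
forward_pass = action_xt(start=(60.0, 34.0), end=(95.0, 34.0))
back_pass = action_xt(start=(95.0, 34.0), end=(60.0, 34.0))
print(round(forward_pass, 2), round(back_pass, 2))
```

Under this simple scheme the forward pass gains exactly what the back pass loses, which is why a pure change-in-xT method can hand out minus points for backpasses.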
More details of how expected threat is calculated can be found in Friends of Tracking in this video.
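For readers who prefer code to video, one common public way of writing the grid calculation is as a Markov chain: the value of a zone is the chance of shooting and scoring from it, plus the chance of moving the ball on, weighted by the value of wherever the ball ends up. The sketch below assumes that formulation. The three zones and every probability in it are made up, and the function and variable names are my own.

```python
def expected_threat(shoot_prob, goal_prob, move_prob, transition, n_iter=200):
    """Iterate xT[i] = s[i] * g[i] + m[i] * sum_j T[i][j] * xT[j].

    shoot_prob[i]    : probability that the next action in zone i is a shot
    goal_prob[i]     : probability that a shot from zone i is scored
    move_prob[i]     : probability that the next action in zone i moves the ball
    transition[i][j] : probability that a move from zone i reaches zone j
                       (rows may sum to less than 1 because moves can fail)
    """
    n = len(shoot_prob)
    xt = [0.0] * n  # start from zero threat everywhere
    for _ in range(n_iter):
        xt = [
            shoot_prob[i] * goal_prob[i]
            + move_prob[i] * sum(transition[i][j] * xt[j] for j in range(n))
            for i in range(n)
        ]
    return xt


# Toy pitch with three zones: defence (0), midfield (1), attack (2).
# All numbers are invented for illustration.
shoot_prob = [0.00, 0.05, 0.30]
goal_prob = [0.00, 0.02, 0.15]
move_prob = [1.00, 0.95, 0.70]
transition = [
    [0.50, 0.30, 0.05],
    [0.20, 0.40, 0.20],
    [0.05, 0.25, 0.40],
]

xt = expected_threat(shoot_prob, goal_prob, move_prob, transition)
print([round(v, 3) for v in xt])
```

Each pass through the loop lets the model "see" one more action into the future, so the values settle after enough iterations. On a real pitch the zones would be a finer grid and the probabilities would be counted from event data.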
Expected Threat was invented by Sarah Rudd in 2011. She didn’t call it that. In fact, she didn’t call it anything, but she had the mathematical insight, using Markov chains, on which it is based. In this video you can see her go through all the steps. On that basis she was recruited to StatDNA, who were bought by Arsenal very soon after. The name xT was first used by Karun Singh, who proposed the idea again in the public sphere in a blog post in 2018.
When we have a clear example of an idea from a female scientist in a male-dominated area, an idea which is now used everywhere, it is extra important that we pause to make sure everyone knows where it came from. There is a history of women’s contributions being forgotten in science. It would be embarrassing if we made this same error in the so-called modern era, especially in football.
So when we hear about how Liverpool used expected goals added (yes, that is expected threat) in recruitment during 2018–19, or about how Opta and Statsbomb have their own versions of expected threat, remember that all this came from the work of one very determined young woman, more than ten years ago, who went to as many sports analytics conferences as she could and pestered everyone she met until she got one of the first ever jobs in football analytics.
(I wrote about Sarah Rudd in Soccermatics and I have booked her in for a Friends of Tracking video during the autumn, so there will be a chance to hear more about her story soon.)
Now, with that cleared up, I want to make a more subtle point. There are a lot more ways of measuring expected threat than there are ways of measuring, for example, expected goals. Indeed, the technique we use at Twelve football is different from the one outlined above. And, dare I say it, better… (When I say this, I don’t mean it is better than what Rudd went on to use at Arsenal. A lot of development has happened between her talk in 2011 and now. But it is the best way to implement xT if you have, for example, Opta, Statsbomb or Wyscout event data.)
Our logic is as follows. Football is a dynamic game. A pass does not just have value because of where the ball ends up; it also has value because of how it shifts the defence. So when we calculate the threat of a pass, we include both the start and end co-ordinates of the pass. We also include qualifiers, such as whether it was a cross or a through ball. This prevents us from, for example, overvaluing ‘hopeful’ crosses into the box.
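Here is a rough sketch of what "including start, end and qualifiers" can look like in code. It is not our fitted model: the weights, the feature names, the logistic form and the 105 x 68 metre pitch are all assumptions chosen only to show how a qualifier can change the value of two passes with identical co-ordinates.

```python
import math

PITCH_LENGTH = 105.0  # metres, assumed
PITCH_WIDTH = 68.0    # metres, assumed

# Hypothetical weights, invented purely for illustration. A real model
# would be fitted to event data and would look different.
WEIGHTS = {
    "bias": -6.0,
    "x_start": 0.005,
    "start_centrality": 0.005,
    "x_end": 0.03,
    "end_centrality": 0.02,
    "is_cross": -0.8,
    "is_through_ball": 0.9,
}


def centrality(y):
    """Distance in metres from the nearest touchline (0 = on the touchline)."""
    return PITCH_WIDTH / 2 - abs(y - PITCH_WIDTH / 2)


def pass_features(p):
    """Turn one pass event (a dict) into named numeric features."""
    return {
        "bias": 1.0,
        "x_start": p["x_start"],
        "start_centrality": centrality(p["y_start"]),
        "x_end": p["x_end"],
        "end_centrality": centrality(p["y_end"]),
        "is_cross": 1.0 if p.get("cross") else 0.0,
        "is_through_ball": 1.0 if p.get("through_ball") else 0.0,
    }


def pass_threat(p):
    """Logistic score in (0, 1) built from the weighted features."""
    feats = pass_features(p)
    z = sum(WEIGHTS[name] * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-z))


# Three passes with identical start and end points, differing only in qualifier.
base = {"x_start": 80.0, "y_start": 5.0, "x_end": 98.0, "y_end": 34.0}
plain = pass_threat(base)
hopeful_cross = pass_threat({**base, "cross": True})
through_ball = pass_threat({**base, "through_ball": True})
print(round(hopeful_cross, 3), round(plain, 3), round(through_ball, 3))
```

With a negative weight on the cross qualifier, the cross is valued below a plain pass to the same spot, which is the behaviour described above. In practice the sign and size of each weight would come from fitting to data, not from me typing numbers in.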
The threat then lies in how the ball is moved, rather than where the ball is. This means that cross-field balls get higher value and backpasses don’t give minus points, as they might in a pure change-in-xT method. Here are some examples of valuable passes, according to a possession chain method, by four of the best passers from the last third of last season.
Here we see how the model picks up the value Trent Alexander-Arnold adds with cross balls. He was ranked 3rd in the Premier League per 90 during the last third of the season. Mount’s threat (ranked 5th) comes from shorter passes into the box. If you are wondering, Thiago was ranked first in the last part of the season.
We include much more than just the start and end co-ordinates when fitting the model. Here are the top xT producers through passing and dribbling per 90 for the whole of last season.
The method we use exploits the power of possession chains. Every sequence of play is grouped together based on who had the ball. A chain is broken if the team scores, the ball goes out of play or the opposition touch it two times or more. The video below explains how we use that to measure the value of a pass.
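The chain-breaking rules can be sketched in a few lines of Python. The event format (dicts with "team" and "type"), the type names "goal" and "out_of_play", and the function name are hypothetical. I also read "two times or more" as two consecutive opposition touches and leave a single stray opposition touch out of the chain. Both are assumptions for this sketch.

```python
def split_into_chains(events):
    """Group a time-ordered list of events into possession chains.

    A chain ends when the team in possession scores, the ball goes out of
    play, or the opposition touch the ball twice in a row (assumed reading).
    """
    chains = []
    current = []    # events of the chain being built
    owner = None    # team in possession of the current chain
    pending = None  # one opposition touch, not yet a confirmed turnover

    def close():
        nonlocal current, owner, pending
        if current:
            chains.append({"team": owner, "events": current})
        current, owner, pending = [], None, None

    for ev in events:
        if owner is None:
            owner, current = ev["team"], [ev]
        elif ev["team"] == owner:
            pending = None  # the single opposition touch did not break the chain
            current.append(ev)
        elif pending is None:
            pending = ev  # first opposition touch: wait and see
        else:
            # Second consecutive opposition touch: possession has changed.
            first_touch = pending
            close()
            owner, current = ev["team"], [first_touch, ev]
        if ev["type"] in ("goal", "out_of_play"):
            close()
    close()
    return chains


# Invented sequence: team A attacks, B nick the ball once, A keep it and
# shoot, then B win it, score, and A restart and put the ball out.
events = [
    {"team": "A", "type": "pass"},
    {"team": "A", "type": "pass"},
    {"team": "B", "type": "touch"},
    {"team": "A", "type": "pass"},
    {"team": "A", "type": "shot"},
    {"team": "B", "type": "pass"},
    {"team": "B", "type": "pass"},
    {"team": "B", "type": "goal"},
    {"team": "A", "type": "pass"},
    {"team": "A", "type": "out_of_play"},
]

chains = split_into_chains(events)
for chain in chains:
    print(chain["team"], [ev["type"] for ev in chain["events"]])
```

Once every pass belongs to a chain, it can be linked to what that chain eventually produced. How that link is turned into a value is what the video covers and is not shown in this sketch.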
Twelve have made player rankings based on this method publicly available for over four years now. Although we now keep our online ratings site a bit hidden (because we would like you to download the lovely, colourful app), you can look at them here.
I never really used the term Expected Threat that much, preferring Pass Impact or Pass Value or even Markov Model. But I think it’s important we use the same terminology, so that we understand each other. So, I am going to follow Tom Worville from The Athletic on this one: from now on I will call this statistic, which (attempts to) measure the probability that an action will lead to a goal, which was invented by Sarah Rudd and which can be implemented in a variety of ways (of which some are better than others), Expected Threat.
And just to prove my commitment, here is a Twelve xT for Brentford against Arsenal.
Maybe slightly nicer than, as highlighted by Tom, Oracle systems’ attempt.
In my next Medium post I will present our professional scouting version of expected threat, which includes tracking data (player positions). Stay tuned!