
What does a football data scientist do?

5 min read · May 13, 2021


You have watched hours of tactics videos. You have attended video analyst courses. You have taken a coaching license and trained a local youth team. You have read all the books and every article on the Barcelona Innovation Hub. You have watched the YouTube channels and listened to all the podcasts.

You have studied advanced mathematics: linear algebra, statistics, machine learning and calculus. Learnt Python and R. Understood SQL. You know how to create an interactive visualization using JavaScript and Power BI.

You have a degree in mathematics or engineering. You might even have gotten a PhD in theoretical physics….

And during all that time, you have written to clubs and national teams. Told them you will work for free, only to get a foot in the door.

And now you have finally made it. A club has put faith in you and employed you as a full-time data scientist. What can you expect on your first day at work?

Well, probably, someone will ask you if you can get their laptop connected to the printer.

Then everyone will head off to watch the first team training.

Once back, they will be extremely busy preparing for the next match, watching videos of the opposition, talking through set-pieces. Once that is done, they will play a very intense table tennis tournament, which cannot be disturbed until it is finished and it is established who is the best table tennis player in the club. After that, they will go home to spend the evening watching games and player clips.

This is what a football data scientist does all day. You sit there, watching a table tennis game (you were eliminated in the first round), wondering what exactly it was you spent all that time training for.

The above description is exaggerated, but only slightly. The reality of a football club is very different from the abstract world of data science and analytics. In many cases, a club has employed a data scientist because the board and the business side thought it was a good idea. They heard about Liverpool’s success using data, saw a couple of tweets about expected goals and decided that, finally, they needed to have one of those analytics people too. The idea from the board is that data science will make the club more professional. Those working day to day inside the club are resentful that the board doesn’t just trust them to do their job.

What is your way forward then as a data scientist?

It is best to take small steps.

One of the first questions I looked at was the effectiveness of half-time talks. The sporting director said that he felt that the manager’s half-time talks weren’t inspirational enough. They were too factual and lacked emotion. The players seemed to come out after the break with their arms hanging down and their shoulders hunched. They weren’t standing up tall and he felt we came off worse as a result.

This was a statistical question. Were we stronger or weaker after half-time? I crunched the numbers, comparing our expected threat (quality of attacking play) and expected goals for the first fifteen minutes after the break to the rest of the match. I also looked at similar statistics for other teams. The conclusion? We were strongest during those fifteen minutes. There was no evidence the manager’s half-time talks weren’t working. On the other hand, we could see that we were always fading in the last 15 minutes. We could start to think about ways to work with fitness and mentality during the last part of the match.
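For readers who want to try a comparison like this themselves, here is a minimal sketch in Python. It is not the analysis I ran at the club: the shot records are made up purely for illustration, the function name `compare_window` is hypothetical, and it looks only at expected goals per minute, ignoring expected threat, stoppage time and the opposition.

```python
# Hypothetical shot records: (match_id, minute, xg).
# The numbers are invented for illustration, not real club data.
shots = [
    ("m1", 12, 0.08),
    ("m1", 47, 0.30),
    ("m1", 52, 0.15),
    ("m1", 88, 0.05),
    ("m2", 30, 0.10),
    ("m2", 58, 0.25),
    ("m2", 70, 0.12),
]

MATCH_MINUTES = 90  # simplifying assumption: stoppage time is ignored


def compare_window(shots, start, end):
    """Return (xG per minute inside [start, end), xG per minute in the rest of the match)."""
    n_matches = len({match_id for match_id, _, _ in shots})
    window_total = sum(xg for _, minute, xg in shots if start <= minute < end)
    all_total = sum(xg for _, _, xg in shots)

    window_minutes = (end - start) * n_matches
    rest_minutes = (MATCH_MINUTES - (end - start)) * n_matches

    window_rate = window_total / window_minutes
    rest_rate = (all_total - window_total) / rest_minutes
    return window_rate, rest_rate


# First fifteen minutes after the break versus the rest of the match.
after_break, rest_of_match = compare_window(shots, 45, 60)
print(f"45-60 min: {after_break:.4f} xG/min, rest: {rest_of_match:.4f} xG/min")

# The same function covers the closing stages of the match.
late, before_late = compare_window(shots, 75, 90)
print(f"75-90 min: {late:.4f} xG/min, rest: {before_late:.4f} xG/min")
```

With real data you would want many more matches, the same numbers for the opposition and for other teams, and some measure of uncertainty before telling a sporting director anything.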

The question about half-time talks didn’t come up in a formal meeting about how we could use analytics. It came up while sitting in the dugout watching training. Football doesn’t work in a linear fashion — meeting, decision, analysis, result — it works through a web of complex communications — over coffee, training and table tennis tournaments.

It is answering small questions like this which builds confidence. And listening to how people in football work allows you to learn.

When I talk about learning, I don’t just mean learning tactical insights or understanding the manager’s style of play. I mean learning about the flow of the working week. There is a reason for the table tennis tournament. It is a release of tension. It is a way of bonding together, for everyone in the club, because we do this together. There is a reason why the coaching staff don’t fully trust the board: they can be fired at any time, just because of an unlucky run of results.

Those working in football don’t necessarily want statistics to remind them of the pressures they are already under. There are very few jobs in the world where your performance is assessed every Saturday in front of tens of thousands of passionate (and often inebriated) fans, all of whom have an opinion. There are few jobs where failure over 90 minutes leads to an internal darkness, which you have to push yourself through for the sake of those around you. The pressures are already large enough. It is those pressures you have to learn.

From there it will come.

And then you can start. You can start to look at why the team’s form is dropping in the last 15 minutes: who is running back and who isn’t and why. You can start to look at whether the overloads the manager is trying to create on the left are working. You can analyze when a long throw-in works and when it doesn’t. You can start to develop Key Performance Indicators (KPIs) for the team and players. You can investigate how these can be used in scouting new players.

Then you can start to build an IT system around what you have learnt, providing automated insight that reflects the culture of the club. Then you are a football data scientist.

But never forget. It is also your job to make sure everyone’s laptop is connected to the printer…

A modified version of this was originally published at https://barcainnovationhub.fcbarcelona.com in June, 2022.


David Sumpter

Written by David Sumpter

Books: Four Ways of Thinking (2023); The Ten Equations (2020); Outnumbered (2018); Soccermatics (2016) and Collective Animal Behavior (2010).
