Roy Hodgson, the English national team manager, made a short speech about the uselessness of advanced statistics in football at Leicester City’s Tactical Insights conference. Though we don’t know what was said exactly, the reaction on Twitter from soccer analysts was swift. One, Ted Knutson, who works in player recruitment at Brentford FC, wondered whether Hodgson simply had a bad experience with a few crappy analysts. Knutson wondered whether it was time for ‘good’ soccer stats people to write a manifesto:
I’ve been writing about developments in football analytics for the past five years. Though I’m not an analyst, I do think I have a basic handle on the aims and limits of stats analysis in soccer in 2016. So while I won’t write a “mission statement,” I will address a few common misconceptions about stats in football.
Football is a wonderful, dynamic, and often inscrutable game. Few analysts would disagree there. Even so, football today is subject to new data gathering techniques, like the use of so-called ‘counting stats’—shots, tackles, clearances etc—or more complex player positioning data. These provide more than enough information for analysts to do good work.
For example, if you have two data sets, you can run a linear regression to see if one correlates strongly with the other. We know, for example, that in the long term, teams that take more shots as a rule score more goals. At one point it was in vogue to use this information to make reasonable predictions about where teams will finish at the end of the year. Now analysts can measure the likelihood that a certain shot type taken from a certain position will go in the net, and use this information to judge how good teams are at creating dangerous chances (hint: closer is better, duh duh duh!). This is all obvious, right? So you might therefore think that…
On the contrary, here’s a very basic scenario where these ‘obvious’ stats can be useful. A team plays ten matches and loses seven. The manager comes under pressure and the director of football wonders what to do. So she consults the statistical analyst, who no doubt spends most of the day twiddling their thumbs in some chilly back office which smells faintly of cheese.
In between phlegmatic coughing fits, the 14-year-old analyst tells her that the data suggest the team is creating heaps of dangerous chances and conceding few. In football, this is a very difficult thing for any team to do consistently, and therefore it correlates strongly with more wins in the long term. On paper at least, if, in some horrible Groundhog Day-esque scenario, the team played these same ten matches with the same chances 10,000 times over, 80% of the time they would have won 8 of them. In other words, the losing streak may be more down to simple bad luck than bad management.
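For the curious, here is a rough sketch of what that Groundhog Day replay could look like in Python. It is only an illustration, not any club's actual model: the function names are made up for this example, and the chance probabilities are invented numbers standing in for whatever shot-quality estimates an analyst would really use. Each chance is treated as an independent coin flip, which is a simplifying assumption.

```python
import random


def simulate_match(team_chances, opp_chances, rng):
    """Replay one match; return 'W', 'D' or 'L' from the team's point of view.

    Each chance is a probability of scoring (hypothetical values), and
    every chance is treated as an independent event.
    """
    goals_for = sum(rng.random() < p for p in team_chances)
    goals_against = sum(rng.random() < p for p in opp_chances)
    if goals_for > goals_against:
        return "W"
    if goals_for == goals_against:
        return "D"
    return "L"


def win_distribution(matches, n_sims=10_000, seed=0):
    """Replay the whole run of matches n_sims times.

    Returns a list where entry k is the share of replays in which
    the team won exactly k matches.
    """
    rng = random.Random(seed)
    counts = [0] * (len(matches) + 1)
    for _ in range(n_sims):
        wins = sum(simulate_match(t, o, rng) == "W" for t, o in matches)
        counts[wins] += 1
    return [c / n_sims for c in counts]


# Invented ten-match run: the team creates five decent chances per game,
# the opposition two poor ones. These numbers are assumptions, not data.
matches = [([0.4, 0.3, 0.3, 0.2, 0.1], [0.1, 0.05])] * 10
dist = win_distribution(matches)

# Share of replays in which the team wins more than the three it actually won.
share_better = sum(dist[4:])
print(f"Won 4+ of 10 in {share_better:.0%} of replays")
```

If the replays say a team with these chances nearly always wins more than three matches in ten, that is a hint the real-world results owe something to luck. It is a hint, not a verdict, which is why the director still asks for a closer look.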
The director of football, still understandably skeptical, asks the performance analyst to take a closer look. They come back to her and say, yes, our team took a lot of scuffed shots and saw a lot of incredible opposition saves. They also conceded a few very soft goals. The director doesn’t sack the manager, and the club eventually makes a respectable 7th place finish. The system works!
Not most of the time, no. But they don’t negate the importance of these qualities either, and sometimes they can be used to argue for their importance. Analyst Daniel Altman, for example, did just that, writing that while Chelsea’s decline under Mourinho this past season wasn’t picked up by the numbers, it seemed to start in earnest after Jose Mourinho’s blistering, arguably misogynistic attack on the team doctor, Eva Carneiro. Altman did use stats to make his case, but of a very different sort: he showed that most players on the Chelsea first team not only had families, they also had daughters. As Altman speculated,
So stats and character need not always be in opposition, except, perhaps, in certain newspaper op-eds.
Well, no arguments here. What stats can do at least is act as a filter to ensure that the scouts don’t waste their time flying to watch players who are almost certainly duds. Stats can be great at this, in fact. For example, you can use player age spread or Elo scores to rate respective leagues and clubs to get a better sense of their relative quality. From there you can limit your search to players who’ve played at least 70% of available injury-free minutes, so you know you’re getting someone good enough to consistently start matches.
Then if you’re really picky, you can use more advanced data to judge how well these players either create or prevent goal-scoring opportunities. But this is only one part of a thorough process. Once the team has made up an appropriate list, it’s still up to the traditional scout to go see if they’re rubbish or not, if they have an attitude problem, if they’re ugly etc.
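The minutes filter mentioned above is about as simple as data work gets. A minimal sketch, assuming each player record carries the minutes they played and the injury-free minutes they could have played; the `Player` class, its field names, and the sample figures are all hypothetical:

```python
from dataclasses import dataclass


@dataclass
class Player:
    name: str
    minutes_played: int
    minutes_available: int  # injury-free minutes the player could have played


def shortlist(players, min_share=0.70):
    """Keep players who played at least min_share of their available minutes."""
    kept = []
    for p in players:
        if p.minutes_available <= 0:
            continue  # no usable record, so leave this one to the scouts
        if p.minutes_played / p.minutes_available >= min_share:
            kept.append(p)
    return kept


# Invented sample data, for illustration only.
players = [
    Player("Player A", 2700, 3000),
    Player("Player B", 1200, 3060),
    Player("Player C", 2400, 3000),
    Player("Player D", 0, 0),
]
names = [p.name for p in shortlist(players)]
print(names)
```

The 70% threshold is the one from the text; any real club would tune it, and would still send a scout afterwards.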
Most stats analysts I talk to are as passionate, knowledgeable and romantic about football as any broadsheet writer, pundit or fan. They are also competitors. They don’t want their team to make stupid, obvious blunders. More than that, they don’t want their clubs to blindly follow the same counterproductive practices everyone else does, like signing players with goal tallies inflated by penalty kicks, or firing a decent manager suffering a clear run of bad luck.
Though a lot of stats work is about marginal gains—taking better corners and free kicks etc—its most important function is to help teams avoid making silly mistakes. While many in football believe, cynically, that money is destiny when it comes to the league table, good analysts believe most clubs have only just begun to explore smarter ways of using data to do better business. None of this has to do with taking romance out of the game. Shitty transfers are boring. Sacking good managers is boring. Chucking in 60 crosses a game is boring. Analysts simply want to help reduce or eliminate all this so teams can better focus on the stuff that makes football great.