The metric du jour in soccer circles these days is—still?—Expected Goals. Though it has been the subject of criticism and debate from all sides of late—on the anti-analytics side for nerdifying/Americanizing pwopah football even more than it already is, on the pro-analytics side for being the one and only major metric produced in soccer circles over the last three years—it is firmly entrenched in the soccer discourse.
We know this, in part, because it was the subject of a Twitter debate between some prominent soccer journalists yesterday, with Guardianistas Sean Ingle and Rafa Honigstein on one side and Miguel Delaney on the other (by the way, you should read Honigstein’s highly insightful article that sparked it all).
Before we go on, a clunky definition: Expected Goals (ExpGs, or xGs) use average shot conversion rates based on shot a) type and b) location to establish how many goals you would normally expect a team to score based on their shots. From this you can get a range of probable score lines over one or several matches based only on shots taken.
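To make that definition a bit less clunky, here is a minimal sketch of the arithmetic in Python. The conversion rates, the zone names and the `expected_goals` helper are all invented for illustration; real xG models use far finer categories and their own fitted rates.

```python
# Hypothetical average conversion rates by (shot type, location zone).
# These numbers are made up for illustration, not taken from any real model.
CONVERSION_RATES = {
    ("foot", "six_yard_box"): 0.35,
    ("foot", "penalty_area"): 0.12,
    ("foot", "outside_box"): 0.03,
    ("header", "six_yard_box"): 0.20,
    ("header", "penalty_area"): 0.07,
}


def expected_goals(shots):
    """Add up the average conversion rate for each (type, location) shot."""
    return sum(CONVERSION_RATES[shot] for shot in shots)


# An invented set of four shots from one team in one match.
team_shots = [
    ("foot", "outside_box"),
    ("foot", "penalty_area"),
    ("header", "penalty_area"),
    ("foot", "six_yard_box"),
]

team_xg = expected_goals(team_shots)
print(f"xG from {len(team_shots)} shots: {team_xg:.2f}")
```

The point is only that a team's xG is a sum of per-shot scoring chances, so four shots can add up to well under one expected goal.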
That word—’probable’—is too often overlooked. To understand why, here’s another way to think about ExpGs. Imagine a game in which Team A takes 12 shots, Team B takes 5, with the result finishing 0-1 for Team B. Now let’s imagine you play this game over and over again in some Monte Carlo simulator in the sky, 10,000 times over. There, the most frequent outcome is 2-1 in favour of Team A. This doesn’t mean Team A should have won; it just tells us that it was the most probable outcome among a range of score lines running between 12-5 and 0-0 based on the shot types (to really see this effect, see Danny Page’s excellent xG simulator).
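For the curious, the "Monte Carlo simulator in the sky" can be roughly sketched in a few lines of Python. This is not Danny Page's simulator, just a bare-bones version of the same idea: treat every shot as an independent coin flip weighted by its xG, replay the match many times, and count the score lines. The per-shot xG values below are invented, and the independence of shots is a simplifying assumption.

```python
import random
from collections import Counter


def simulate_scorelines(xg_a, xg_b, n_sims=10_000, seed=0):
    """Replay a match n_sims times; each shot scores with probability = its xG."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_sims):
        goals_a = sum(rng.random() < p for p in xg_a)
        goals_b = sum(rng.random() < p for p in xg_b)
        counts[(goals_a, goals_b)] += 1
    return counts


# Invented per-shot xG values: 12 shots for Team A, 5 for Team B.
team_a = [0.35, 0.30, 0.25, 0.20, 0.15, 0.12, 0.10, 0.08, 0.07, 0.05, 0.05, 0.03]
team_b = [0.40, 0.25, 0.15, 0.10, 0.05]

results = simulate_scorelines(team_a, team_b)
for (goals_a, goals_b), n in results.most_common(5):
    print(f"{goals_a}-{goals_b}: {n / 10_000:.1%} of simulations")
```

Whatever score line comes out on top, it will typically account for only a modest share of the simulations, which is exactly why "most probable" is not the same thing as "should have happened".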
Anyhoo, what do teams and fans use this metric for? And what maybe should they use it for? Here are some possible uses/misuses.
A team wins a whole whack of matches everyone expected them to lose, they rocket to the top of the league, and everyone goes bananas! Then some stupid nerd spoils the party and points out that the team is actually really, really lucky, and the type of shots they took would normally put them several places lower in the table. Think this season’s Leicester City, for example. You might think, “Why does it matter, they’re going to win the league!”, and you’d have a point. But the Foxes can use this info to ensure they prepare for a regression to the mean next season, or at least not sack Ranieri when it happens, which leads us to…
Expected Goals are useful for telling a club, “Hey, in the best of all possible worlds, the kind of shots your team is taking would give a better goal difference and move you up in the table four spots. Also, that losing streak? Mostly just dumb bad luck.” This information is useful for a number of reasons, but the biggest one is avoiding sacking a perfectly decent manager after a poor run of performances that had more to do with shitty luck than anything else.
This one’s pretty straightforward. If you find a player who scored 18 league goals for some no-name club in Europe, ExpGs can help provide some basic quality control. If their ExpG total from that season suggests they should have scored only 9, and if that is the only season in which the player exceeded their xG total, I would either lowball an offer or pass altogether. Note that this is a made-up example and player recruitment is often a much messier business. But I would rather know xGs than not know them.
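As a rough sketch of that quality-control check, here it is in Python. The career record, the season labels and the `seasons_above_xg` helper are all hypothetical, in keeping with the made-up example; a real scouting workflow would look at far more than this.

```python
def seasons_above_xg(seasons):
    """Return the labels of seasons in which actual goals exceeded xG."""
    return [label for label, goals, xg in seasons if goals > xg]


# Invented career record: (season, league goals, xG total).
career = [
    ("season 1", 7, 8.1),
    ("season 2", 9, 9.4),
    ("season 3", 18, 9.0),  # the eye-catching 18-goal year
]

hot_seasons = seasons_above_xg(career)
if len(hot_seasons) == 1:
    print(f"Only {hot_seasons[0]} beat xG: treat the goal tally with caution.")
else:
    print(f"Beat xG in {len(hot_seasons)} of {len(career)} seasons.")
```

A single season above xG is not proof of a fluke, only a prompt to dig further before opening the chequebook.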
You can use xGs for all kinds of neat stuff. Which passing combinations lead to higher quality shots? Which game situations? Open play or free kicks? Which defensive lines lower xGs? As part of a comprehensive analysis, xGs do a bang-up job of breaking down “ideal” attacking situations. But! There is a limit to all this, which brings us to…
If you see any linear regression of Actual Goals on ExpGs, you will almost always see the usual suspects (Barcelona, Bayern Munich) as outliers in the top right-hand corner of the plot. The same goes for elite players like Messi, who are, duh, above average finishers. If Messi takes a shot from the same location as any other player on earth 10,000 times over, he’s going to score a lot more often because he’s Lionel Messi. And if you have a team with Messi-like players, you’re going to “overperform” xGs fairly consistently. I don’t know exactly what this means, except that I think we can say “overperformance” among the super-elite is probably more than just luck, year-in year-out.
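Here is a minimal sketch of how such an outlier shows up, using an ordinary least-squares line fitted by hand in Python. The team names and season totals are invented, with one hypothetical "Elite FC" deliberately given far more goals than its xG; nothing here is real data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented season totals: team -> (xG, actual goals).
teams = {
    "Team A": (35, 33),
    "Team B": (42, 44),
    "Team C": (48, 46),
    "Team D": (55, 57),
    "Team E": (61, 59),
    "Team F": (70, 72),
    "Elite FC": (78, 95),  # hypothetical side full of above-average finishers
    "Team G": (80, 79),
}

xgs = [xg for xg, _ in teams.values()]
goals = [g for _, g in teams.values()]
slope, intercept = fit_line(xgs, goals)

# Residual = actual goals minus what the fitted line predicts from xG.
residuals = {
    name: g - (slope * xg + intercept) for name, (xg, g) in teams.items()
}
top = max(residuals, key=residuals.get)
print(f"Biggest overperformer vs the fitted line: {top} ({residuals[top]:+.1f} goals)")
```

One season's residual cannot separate finishing skill from luck; the argument above rests on the same clubs landing in that corner year after year.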
Single game xG comparisons are fine if you want to see how dangerous a team’s chances were in a particular game, but drawing conclusions from those single-match numbers, either positive or negative, can be misleading. The ability to generate a higher Expected Goal tally is a statistically repeatable effect, but not after one game, where random variation still applies. Therefore, criticising or praising a side based on single game xGs, outside of offering concrete evidence for something a team may or may not have done deliberately (tactics, poor decision-making etc.), is asking for trouble.
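A toy simulation gives a feel for why one game tells you so little. The sketch below assumes, purely for illustration, that a team's per-match xG is a noisy draw around a fixed "true" level (normally distributed and clipped at zero); the 1.5 mean, the 0.6 spread and the `xg_spread` helper are all invented.

```python
import random
import statistics


def xg_spread(true_mean, match_sd, n_matches, n_sims=2000, seed=0):
    """Std dev of a team's average xG per match over n_matches, across n_sims replays."""
    rng = random.Random(seed)
    averages = []
    for _ in range(n_sims):
        matches = [
            max(0.0, rng.gauss(true_mean, match_sd)) for _ in range(n_matches)
        ]
        averages.append(statistics.fmean(matches))
    return statistics.stdev(averages)


single_game = xg_spread(1.5, 0.6, n_matches=1)
full_season = xg_spread(1.5, 0.6, n_matches=38)
print(f"Spread of xG in a single game:        {single_game:.2f}")
print(f"Spread of average xG over 38 matches: {full_season:.2f}")
```

Under these assumptions the season-long average settles down far more tightly than any single match does, which is the statistical version of "not after one game".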
Seth Partnow recently wrote a compelling post for Vice on Goodhart’s Law, which, according to Partnow, contends that “when a measure becomes a target, it ceases to be a good measure.” Goodhart, an economist and former adviser to the Bank of England, actually said, “Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.” Expected Goals is probably best used as a diagnostic tool; jury-rigging tactics to bump up xGs in the hopes you’ll score more goals, say, by only taking shots from high-payoff locations, is to risk unintended consequences, like creating more opportunities for the opposition to break, or leading to more lost possessions in the final third etc. If, however, these can be controlled for, then teams by all means should seek to increase xGs for and decrease xGs against. But the means should be fundamentally holistic and tactical, not “take shots from here only.”