for me, it’s pretty confusing in quite a few instances, but, just to give an example: if the striker shoots and the keeper saves it, but the striker taps in the rebound, how is xG counted? is it counted twice or just once? or, what if neither shot goes in?
In your example, there are two finishes and thus two cases of xG each of which is counted. It doesn’t matter for xG whether the shot actually ends up as a goal.
if the striker shoots and the keeper saves it, but the striker taps in the rebound, how is xG counted?
Two separate shots with two separate xG values.
is it counted twice or just once?
Twice, because there’s two shots.
what if neither shot goes in?
Nothing different, because if xG only counted shots which went in it wouldn’t be a very useful metric.
how exactly is xG counted? in some cases, it’s just confusing for me.
Here is an explanation of xG from a stats site. It may be useful.
xG is simply the chance of the goal being converted by an AVERAGE player. The value doesn’t tell us anything else.
A player outperforming xG (over a long period) is better than average, underperforming is worse than average.
Harry Kane for example has outperformed xG in all but one season of his career.
Gabriel Jesus on the other hand has never hit his xG ever in his career.
Now we don’t need xG to tell us Harry Kane is brilliant and that Jesus misses loads of shots, but it’s a stat that “proves” the eye test.
They’re all just guessing
Welcome to x stats. Where the rules are made up and the stats don’t matter
Exactly, they’re so boring as well. I miss the times when goals and, in some rare cases, assists were all that mattered.
Except it’s not… It is literally the exact % of shots going in from a certain position which absolutely makes sense and is very straightforward.
It’s not perfect, no single stat is but this is a ridiculous take
Honestly I wouldn’t worry about it. I was where you were a few years ago and dug deep into X stats. It’s all rubbish and not worth looking at. They don’t really indicate anything worthwhile.
Hmm I say they do, just not what most people think, for example xG shows that Son is an excellent finisher consistently over many years.
He’s scored 111 premier league goals in 280 matches. I don’t need to see his xG over several years to know he’s a prolific goalscorer.
111 sounds nice, but what if he had 1000 chances? Then he would be a good goalscorer but not a clinical finisher.
Okay, but goalscoring is not just about finishing, it’s about getting yourself into dangerous positions to shoot from. xG is not the be all and end all, but it tells you how effective a player is when it comes to finishing chances from given positions. Say you had 2 sons, both with 111 goals in 280 matches, but one had an xG of 130 and the other of 90, you would clearly be putting your money on the latter as that demonstrates they are more clinical.
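To put numbers on the two-Sons example, here is a quick Python sketch. The 111, 130 and 90 are the hypothetical figures from the comment above, and `finishing_delta` is just a name made up for this illustration, not a standard stat.

```python
def finishing_delta(goals, xg):
    """Goals scored minus expected goals; positive means finishing above the model."""
    return goals - xg

# Hypothetical figures from the example: same goals, different xG.
son_a = finishing_delta(goals=111, xg=130)
son_b = finishing_delta(goals=111, xg=90)

print(son_a, son_b)
```

The first Son comes out at -19 and the second at +21, which is why the second one looks like the more clinical finisher despite identical goal tallies.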
But would we not be able to determine this simply by watching Son play on a regular basis?
To an extent yes, like for sure I knew Son was good, but I didn’t realise how clinical a finisher he was without the stats. I kinda just assumed he missed as much as everyone else, but you just forget about the misses, cause how many shots that miss do you remember over a long period?
Why not both? Stats like xG are just another tool to help judge the quality of a player
True, not saying it can’t be used. But the above comment was saying using xG tells us how good a finisher Son is, and I was saying we don’t need xG to tell us that - we can just watch him play.
My general concern about stats is, as some people above have said, that people treat them as the be-all and end-all for judging how effective or how good a player is. But a lot of the time, we can use the eye test to gauge how good a player is. Not saying stats can’t be used, just not in the way they’re commonly being used now.
The xG ‘score’ will only take into account one shot in the sequence, whichever shot has the highest xG.
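For comparison, here is a rough Python sketch of three ways a two-shot sequence could be scored. The shot values are invented, and the sketch does not settle which rule any given stats provider actually uses. `summed` is the add-everything reading from earlier in the thread, `highest_only` is the rule described in the comment above, and `at_least_one` is a third option added here purely as an assumption, treating the shots as independent.

```python
import math

# Invented xG values for a saved shot followed by a rebound.
sequence = [0.30, 0.60]

def summed(xgs):
    """Count every shot in the sequence separately."""
    return sum(xgs)

def highest_only(xgs):
    """Count only the best chance in the sequence."""
    return max(xgs)

def at_least_one(xgs):
    """Chance that at least one shot is scored, assuming independent shots."""
    return 1.0 - math.prod(1.0 - p for p in xgs)

for rule in (summed, highest_only, at_least_one):
    print(f"{rule.__name__}: {rule(sequence):.2f}")
```

One thing worth noticing: the summed version can go above 1 for a single sequence even though at most one goal can come of it, while the other two cannot.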
How can they even find the average chance of scoring when there are so many factors playing in?
Where is the player on the pitch? Which direction did the pass come from, and at what speed? Was it a high pass or a low one, and did the ball bounce or change direction on the way? Where are the defenders positioned when the player is making the shot? Is the pitch dry/bumpy/wet? Just some random examples of the million factors playing in.
Which factors matter and which do they ignore? I don’t think they have good enough data when considering how “unique” most goal attempts actually are, all factors considered.
Player position and defender position are factored in. The other intricacies you mention are why xG is not a suitable metric for accurately depicting a single game or even several games.
Rather, it’s far more suited for analysing performance over tens of games or entire seasons and can help to highlight problems in chance creation, finishing, or with specific players etc.
You are correct that by including a lot of factors they would have to check they have enough data to estimate each factor properly. However, they do have a lot of data. Opta says they use a database of millions of shots. That still might not be enough if they include too many variables in the model - just because of the sheer number of combinations of variables that are possible. From what I’ve read about this they seem to be pretty disciplined in what they include.
In any case, the things the model won’t predict well are more likely to be rare events that don’t show up many times in the training data. But since those shots are rare anyway, it doesn’t matter so much. You have a trade off between including enough variables in the model to predict most events really well (learn from the huge amounts of data that you do have on common shooting chances in football), or you could in theory limit the ability of the model to predict things so as to be “less surprised” by outlier events, but nobody really would want that.
You have an intuition that most goal attempts are “unique”, but xG seems to do pretty well on average. That “uniqueness” is still captured by the model in terms of variation around the expected number of goals but unfortunately they never present that.
I personally would like to see some information about xG uncertainty, because that WOULD include information about the rarity in the training data. A super rare shot should in theory (assuming a good model) have more uncertainty about the xG.
So if we saw things like an xG range of plausible estimates instead, say it was xG-range 0.7-0.9 because all we had in the game was 1 penalty, then that is pretty common and the uncertainty can be low (narrow range of plausible values). But say instead of a single penalty there was like 5 half chances with rare events included we might see something like xG-range 0.2-1.4 and both cases could have the same total xG overall but by presenting the uncertainty we have a much better understanding of what the model actually predicts and what type of game we are dealing with.
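As a rough illustration of why the same total can hide very different games, here is a Python sketch. To be clear about what it is not: it does not compute the model-uncertainty range described above, which would need the fitted model and its training data. It only shows the simpler spread in possible scorelines, assuming shots are independent. The shot values (one penalty at 0.8, five half chances at 0.16 each) are made up so that both cases total 0.8 xG.

```python
def goal_distribution(xgs):
    """Return P(exactly k goals) for k = 0..len(xgs), treating shots as independent."""
    dist = [1.0]  # before any shot, 0 goals with certainty
    for p in xgs:
        new = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            new[k] += prob * (1 - p)   # this shot is missed
            new[k + 1] += prob * p     # this shot is scored
        dist = new
    return dist

def expected_goals(dist):
    """Mean number of goals implied by a distribution."""
    return sum(k * prob for k, prob in enumerate(dist))

penalty = [0.8]              # invented value for a single penalty
half_chances = [0.16] * 5    # invented values for five half chances

for name, xgs in (("penalty", penalty), ("half chances", half_chances)):
    dist = goal_distribution(xgs)
    print(name, [round(prob, 3) for prob in dist])
```

With these made-up numbers both cases have the same expected goals, but the chance of scoring nothing is 0.2 for the penalty and roughly 0.42 for the five half chances, and only the second case can produce two or more goals.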
It’s just an opinion that fools people into thinking it’s a fact/stat because they use a number with decimal places
All those new weird metrics are confusing in general and are mainly used by internet nerds to prove that their favourite player is better than others. I’ve been watching football for decades and before the 2010s people barely cared about who had a lot of assists, now you have all those weird xG, xA, dribbles per 90 and I don’t even know what else.
Real football fans just watch games and judge performances that way. I wouldn’t give a shit if my team’s striker had awful “expected” stats but still delivered.
So xG models football as a random process. Doing this, we can predict the chance of a goal being scored for every shot taken. That probability is calculated based on things like where the shot was taken from, and what part of the body the shot was taken with.
To get the xG for a team or player, we simply add up the probabilities of all their individual goal scoring opportunities, because on average, if football were a random process, that’d be how many goals they’d score. This gives us a way of measuring the quality of goal scoring chances a given team or player is presented with.
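The adding-up step can be sketched in a few lines of Python. The shot list below is invented for illustration (team labels and xG values are not real data), and `xg_by_team` is a hypothetical helper name.

```python
from collections import defaultdict

# Invented shots from one match: (team, per-shot xG).
shots = [
    ("Home", 0.76),
    ("Away", 0.08),
    ("Home", 0.12),
    ("Away", 0.35),
    ("Away", 0.04),
]

def xg_by_team(shots):
    """Add up the per-shot goal probabilities for each team."""
    totals = defaultdict(float)
    for team, xg in shots:
        totals[team] += xg
    return dict(totals)

for team, total in xg_by_team(shots).items():
    print(f"{team}: {total:.2f} xG")
```

The same loop works for a player over a season: swap the team label for a player name and feed in every shot they took.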
Of course, real football isn’t a random process, and there are players or teams that can consistently perform better than the model predicts (e.g. Kane, Son, Haaland), or worse (Chelsea). However, these models do fit on average, and they can enrich our tactical understanding of the game if applied correctly.
The maths nerds amongst you may ask how we calculate that probability. This is typically done with a logistic regression on loads of data from competitive matches. The output of this is a function that relates the log-odds of a goal being scored to a linear combination of some input variables, which we can then validate against another data set to make sure the model fits. Those input variables change from model to model, and how many variables you can usefully fit will depend on how large your training and validation data sets are, but the ones I mentioned earlier are a solid starting point. Extended models can incorporate things like the position of the keeper and defenders, or where the shot was placed (xGOT).
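Here is a minimal Python sketch of what the fitted function looks like once the regression is done. The coefficients are invented for illustration and not taken from any real model. Using distance and shooting angle as the "where the shot was taken from" inputs, plus a header flag for body part, is my own assumption about the feature set.

```python
import math

# Illustrative coefficients only; a real model estimates these from shot data.
INTERCEPT = -1.0
COEF_DISTANCE = -0.12   # per metre from goal
COEF_ANGLE = 1.5        # per radian of goal mouth visible to the shooter
COEF_HEADER = -0.8      # applied when the shot is a header

def shot_xg(distance_m, angle_rad, is_header):
    """Map shot features to a goal probability via the logistic function."""
    # Log-odds are a linear combination of the input variables.
    log_odds = (
        INTERCEPT
        + COEF_DISTANCE * distance_m
        + COEF_ANGLE * angle_rad
        + (COEF_HEADER if is_header else 0.0)
    )
    # The logistic function turns log-odds into a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-log_odds))

close_range = shot_xg(distance_m=6, angle_rad=1.0, is_header=False)
long_range = shot_xg(distance_m=25, angle_rad=0.3, is_header=False)
print(f"close: {close_range:.3f}, long: {long_range:.3f}")
```

Fitting the real thing means choosing the coefficients so that predicted probabilities match observed goal rates on a large shot dataset, then checking them against held-out shots as described above.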
For anyone who’s interested in getting really nerdy about this, here’s a freely available course on football analytics, though you’ll need some statistical and coding knowledge to follow along - https://soccermatics.readthedocs.io/en/latest/
I always found it stupid. A shot taken by Harry Maguire whilst he’s falling backwards in a December snowstorm surrounded by ten defenders, gives the same xG as a shot taken by Haaland calmly picking his spot vs the keeper one on one on a sunny August afternoon - if taken from the same location on the pitch.
xG is a yank statistic. Whenever someone adds in random numbers like 5.86 xG, I’m always sarcastically like “WAOW so that means team would have scored 10 goals this match.”
Wtf is xG
Every stats compiler has their own method and algorithm to calculate “xG”.
It is a fake stat, ignore it.