The post below originally appeared on the superb Shots on Target blog, although the version here has been modified a bit. It introduces my own personal team ratings model, based upon an amalgamation of various shot and chance data from Opta Stats. I have great confidence in this model and have been happy to watch as it foreshadowed, among other things:
- The rise of Southampton as a legitimate threat
- The improvement and underlying quality of Liverpool
- The overvaluation of Chelsea’s early season performance
- The putrid nature of Sunderland’s early season attack
I feel I was a bit ahead of the curve in identifying these trends and have the model to thank (admittedly, I have missed on a couple as well). Going forward, I will publish model-based weekly team ratings on the team ratings page and use these in my evaluations and predictions. Hopefully, you will grow to enjoy these ratings as much as I do.
Thus without further ado, let me introduce SoccerSaber’s team ratings!
Football statistical modeling is in its infancy. While no doubt major clubs have squads of statisticians with proprietary algorithms defining player and team value, the general population has been left to look at goal records and the league table to determine the quality of player and team alike. That all changed with OptaStats. Now, everyone can see the story behind the game. We can look at the activities throughout the pitch and begin to ascertain which of these lead to goals. Further, we can begin to see which activities indicate innate ability and which are simply the luck of the draw. By combining these we can come up with forecasting models for both team and player, a Holy Grail for fantasy football managers.
While @shots_on_target has primarily focused on player evaluation, I have spent numerous hours over the past months hypothesizing on team value. I have worked to understand the underlying activities that drive goals scored and allowed, which has let me construct team value models that I believe are far superior to the league tables. While still a work in progress, I am confident enough in these models to share them with you.
It has been known for some time that shots, and more specifically shots on target, are great predictors of goals. On average, teams score on about a third of their shots on target. More importantly, teams that exceed or fall short of that rate one year tend to regress towards the mean the following year (see James Grayson’s excellent blog for more discussion). As a result, we can use shots on target rather than goals scored as a better indicator of team performance, mainly because of the sample size issues with goals scored (there are about 3 shots on target per goal scored, so the sample is roughly three times larger). This helps us identify teams that may be underrated or overrated based upon goals alone, especially early in the season when sample sizes are low.
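To make the idea concrete, here is a minimal sketch of how the "about a third" rule of thumb could be used to flag teams whose goal totals run ahead of or behind their shots on target. The function names and the example team are my own inventions for illustration, and the one-in-three conversion rate is only the rough average quoted above, not a fitted value.

```python
# Assumption: league-average conversion of roughly one goal per three
# shots on target, as quoted in the text. The true rate varies by season.
LEAGUE_CONVERSION = 1 / 3


def expected_goals(shots_on_target, conversion=LEAGUE_CONVERSION):
    """Goals a team would have if it converted at the league-average rate."""
    return shots_on_target * conversion


def finishing_gap(goals, shots_on_target, conversion=LEAGUE_CONVERSION):
    """Actual goals minus expected goals.

    A positive gap means the team has scored more than its shots on target
    suggest (a candidate to regress downwards); a negative gap means the
    opposite.
    """
    return goals - expected_goals(shots_on_target, conversion)


# Hypothetical team: these numbers are made up purely for illustration.
gap = finishing_gap(goals=20, shots_on_target=45)
print(f"Finishing gap: {gap:+.1f} goals")
```

A large positive or negative gap early in the season is the kind of signal that suggests a team's goal record may not hold up.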
But are shots on target enough? They are certainly a start, and much better than plain old shots or goals scored as a forecaster. Yet to me they seemed… wanting. I looked for some other factor that might do a better job of explaining things.
What I have found is that shots data in combination with “Big Chances” (BC) correlates more strongly with goals than shots data alone. BCs are defined by OptaStats as follows:
Big Chance – A situation where a player should reasonably be expected to score usually in a one-on-one scenario or from very close range.
In my mind, BCs represent shots on steroids. Adding BCs as an additional factor results in the following improvements in goals scored projections over the past three seasons:
2010 – 1.8%
2011 – 2.9%
2012 – 9.7%
While the improvement is not substantial, it is consistent enough for me to have some confidence in the model. For goals allowed, the improvement is even starker, although the available data set currently goes back only to 2011:
2011 – 7.5%
2012 – 15.2%
The available information strongly suggests the SoccerSaber model is an improvement over a pure shot model, both for measuring team attack and team defense.
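For readers who want to try this kind of comparison themselves, here is a rough sketch of one way to test whether Big Chances add anything on top of shots on target. To be clear, this is not the SoccerSaber model itself: I am assuming a plain linear regression and using R-squared as the yardstick, and the function names and sample numbers are hypothetical. It uses only the Python standard library.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(features, target):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    return solve(xtx, xty)


def r_squared(features, target, coefs):
    """Share of the variance in target explained by the fitted line."""
    preds = [coefs[0] + sum(c * x for c, x in zip(coefs[1:], f)) for f in features]
    mean = sum(target) / len(target)
    ss_res = sum((t - p) ** 2 for t, p in zip(target, preds))
    ss_tot = sum((t - mean) ** 2 for t in target)
    return 1 - ss_res / ss_tot


def bc_improvement(sot, bc, goals):
    """Return (R^2 for shots on target only, R^2 for shots on target plus BCs)."""
    shots_only = [(s,) for s in sot]
    shots_bc = list(zip(sot, bc))
    r2_base = r_squared(shots_only, goals, fit_ols(shots_only, goals))
    r2_full = r_squared(shots_bc, goals, fit_ols(shots_bc, goals))
    return r2_base, r2_full


# Hypothetical season totals for six teams, invented for illustration only.
sot = [150, 180, 210, 240, 200, 170]
bc = [40, 55, 70, 90, 60, 45]
goals = [45, 58, 72, 88, 66, 50]

base, full = bc_improvement(sot, bc, goals)
print(f"Shots on target only: R^2 = {base:.3f}")
print(f"Shots on target + BC: R^2 = {full:.3f}")
```

One caveat: on the same data used for fitting, adding a factor can never lower R-squared, so a fair test fits on one season and scores the projections on another.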
So there you have it. As mentioned, I will be referencing these ratings consistently moving forward.