INTRODUCING... A REGRESSION BASED PLAYER RANKING SYSTEM

Whilst averages and strike rates are excellent tools for assessing player performance, they often don't tell the full story. Here we present a regression-based analysis method that uses every delivery in a given period to build a new player ranking system which accounts for opposition quality.

The method works by fitting two regression models to the ball-by-ball data provided at cricsheet.org, using the bowler and the batter as predictors of the outcome of each delivery.
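As a rough sketch of the data preparation, the snippet below flattens one match into a row per delivery. It assumes the JSON layout used by cricsheet.org downloads (innings, then overs, then deliveries); the miniature match, the player names and the flatten_match helper are all hypothetical. Counting runs off the bat rather than total runs, and treating any dismissal as a wicket, are simplifying choices of ours.

```python
# Hypothetical miniature match, shaped like a cricsheet.org JSON download
# (assumed layout: innings -> overs -> deliveries).
sample_match = {
    "innings": [
        {
            "team": "Team A",
            "overs": [
                {
                    "over": 0,
                    "deliveries": [
                        {"batter": "Batter X", "bowler": "Bowler P",
                         "runs": {"batter": 4, "extras": 0, "total": 4}},
                        {"batter": "Batter X", "bowler": "Bowler P",
                         "runs": {"batter": 0, "extras": 0, "total": 0},
                         "wickets": [{"kind": "bowled", "player_out": "Batter X"}]},
                        {"batter": "Batter Y", "bowler": "Bowler P",
                         "runs": {"batter": 1, "extras": 0, "total": 1}},
                    ],
                }
            ],
        }
    ]
}


def flatten_match(match):
    """Return one row per delivery: bowler, batter, runs off the bat, wicket flag."""
    rows = []
    for innings in match.get("innings", []):
        for over in innings.get("overs", []):
            for ball in over.get("deliveries", []):
                rows.append({
                    "bowler": ball["bowler"],
                    "batter": ball["batter"],
                    "runs": ball["runs"]["batter"],
                    # Simplification: any dismissal on the ball counts as a wicket,
                    # including run outs that are not credited to the bowler.
                    "wicket": 1 if ball.get("wickets") else 0,
                })
    return rows


rows = flatten_match(sample_match)
for row in rows:
    print(row)
```

In practice the same function would be applied to every match file in the chosen period and the rows pooled before fitting.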

The first of these is a logistic regression, which predicts the likelihood of a wicket being taken given who the bowler and batter are. Fitting it yields a coefficient for each player, which can be read as that player's contribution to the chance of a wicket. For the bowlers, the higher the better, and this can be considered a measure of wicket taking ability - effectively replacing strike rate. For the batters, the lower the better, and we can call it their wicket preservation ability.
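To make the idea concrete, here is a minimal sketch of such a wicket model on made-up data. It is not the implementation behind the rankings: the deliveries are hypothetical, the fitting is plain gradient ascent, and the small ridge penalty is our own addition so that the bowler and batter effects are pinned down when every delivery has exactly one of each.

```python
import math

# Hypothetical deliveries: (bowler, batter, wicket flag).
deliveries = (
    [("Bowler P", "Batter X", 1)] * 3 + [("Bowler P", "Batter X", 0)] * 7
    + [("Bowler P", "Batter Y", 1)] * 1 + [("Bowler P", "Batter Y", 0)] * 9
    + [("Bowler Q", "Batter X", 1)] * 1 + [("Bowler Q", "Batter X", 0)] * 9
    + [("Bowler Q", "Batter Y", 0)] * 10
)


def fit_wicket_model(data, steps=5000, rate=0.5, ridge=0.01):
    """Fit logit P(wicket) = intercept + bowler effect + batter effect."""
    intercept = 0.0
    bowler = {b: 0.0 for b, _, _ in data}
    batter = {a: 0.0 for _, a, _ in data}
    n = len(data)
    for _ in range(steps):
        g0 = 0.0
        g_bowl = {b: 0.0 for b in bowler}
        g_bat = {a: 0.0 for a in batter}
        for b, a, y in data:
            # Predicted wicket probability for this bowler-batter pairing.
            p = 1.0 / (1.0 + math.exp(-(intercept + bowler[b] + batter[a])))
            err = y - p  # gradient of the log-likelihood for one delivery
            g0 += err
            g_bowl[b] += err
            g_bat[a] += err
        intercept += rate * g0 / n
        for b in bowler:
            # Assumed ridge penalty shrinks player effects towards zero.
            bowler[b] += rate * (g_bowl[b] / n - ridge * bowler[b])
        for a in batter:
            batter[a] += rate * (g_bat[a] / n - ridge * batter[a])
    return intercept, bowler, batter


intercept, bowler, batter = fit_wicket_model(deliveries)
print("bowlers, best first:", sorted(bowler, key=bowler.get, reverse=True))
print("batters, best first:", sorted(batter, key=batter.get))
```

Bowlers are sorted with the highest coefficient first and batters with the lowest first, matching the "higher the better" and "lower the better" readings above.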

The second model is a generalised linear regression based on the negative binomial distribution, and predicts the number of runs scored from a delivery, again using the bowler and batter as predictors. Once again the player coefficients are extracted from this model and used as metrics with which to rate the players. For bowlers this is effectively a measure of economy, that is, their ability to avoid conceding runs, so the lower the better. For the batters it represents their run-scoring ability, and in terms of traditional measures is akin to strike rate.
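For readers who prefer notation, a runs model of this kind could be written as below. This is a sketch under our own assumptions: the log link is the usual choice for negative binomial regression but is not stated above, and the symbols (the intercept, the bowler and batter coefficients, and the dispersion parameter theta) are ours.

```latex
\text{runs}_i \sim \operatorname{NegBin}(\mu_i, \theta), \qquad
\log \mu_i = c + e_{\text{bowler}(i)} + s_{\text{batter}(i)}, \qquad
\operatorname{Var}(\text{runs}_i) = \mu_i + \frac{\mu_i^{2}}{\theta}
```

Here e is the bowler's economy coefficient and s is the batter's run-scoring coefficient for delivery i. On the log scale a coefficient acts multiplicatively, so a bowler with a negative e scales the expected runs per ball down by a factor of exp(e).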

The advantage of these metrics over traditional measures is that they account for the strength of opposition. So, for instance, a batter surviving an over from Mohammed Amir improves their wicket preservation rating more than surviving one from, say, Wahab Riaz. Similarly, keeping it tight when bowling to Glenn Maxwell is worth a lot more than joining up the dots against someone like Hashim Amla.

So far we have only looked at a number of one-dimensional ratings. Now it is time to combine them. This produces a ranking of both bowlers and batters that takes into account each of the abilities worked out above. In cricket - particularly in ODIs and other short forms of the game - there are a variety of different situations which may alter how we perceive the value of these separate abilities. At the end of an ODI innings, for example, wickets become less important as the number of overs becomes the scarcer resource for the batting team. In these situations we value bowler economy and batter strike rates more heavily, and wicket taking and preservation become less important.

Because of this, we introduce two new parameters - α for the batters and β for the bowlers. These dictate how much weight we put on each of a player's two abilities when calculating their final rating. α and β each take a value from 0 to 1, where α is the weight we place on batter strike rate and β is the weight we place on bowler economy.
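One way to write such a weighting, offered as an assumption rather than the exact formula used, is a simple weighted average in which whatever weight is not given to runs goes to wickets:

```latex
R_{\text{batter}} = \alpha \, S + (1 - \alpha) \, P,
\qquad
R_{\text{bowler}} = \beta \, E + (1 - \beta) \, W
```

Here S and P are a batter's run-scoring and wicket preservation scores, and E and W are a bowler's economy and wicket taking scores. In this sketch each score is assumed to have been put on a common scale and oriented so that higher is better, which means flipping the sign of the batters' wicket coefficients and the bowlers' run coefficients.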

Typically when running this ranking system we will provide three separate rankings each for bowlers and batters: a balanced ranking, where we place equal weight on wickets and runs, and two weighted rankings, where we place more emphasis on wickets and on runs respectively.
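The three rankings can be sketched for batters as follows. Everything here is illustrative: the coefficients are invented, the weights of 0.25 and 0.75 for the two weighted rankings are assumed values, and standardising each coefficient before combining is our own choice for putting the two scales on a common footing.

```python
from statistics import mean, pstdev

# Hypothetical batter coefficients: (run-scoring, wicket) pairs.
# Higher run-scoring is better; a lower wicket coefficient is better.
batters = {
    "Batter X": (0.30, 0.40),
    "Batter Y": (0.05, -0.50),
    "Batter Z": (-0.10, 0.10),
}


def standardise(values):
    """Rescale to mean 0 and standard deviation 1 so the two scales are comparable."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def rank_batters(coefs, runs_weight):
    """Rank batters by runs_weight * run score + (1 - runs_weight) * wicket score."""
    names = list(coefs)
    run_scores = standardise([coefs[n][0] for n in names])
    # Negate the wicket coefficient so that higher always means better.
    wicket_scores = standardise([-coefs[n][1] for n in names])
    combined = {
        n: runs_weight * r + (1 - runs_weight) * w
        for n, r, w in zip(names, run_scores, wicket_scores)
    }
    return sorted(combined, key=combined.get, reverse=True)


# Assumed weights: 0.5 for balanced, 0.25 and 0.75 for the two weighted rankings.
for label, weight in [("balanced", 0.5), ("wickets", 0.25), ("runs", 0.75)]:
    print(label, rank_batters(batters, weight))
```

The same function would serve for bowlers with the signs swapped, since for them a higher wicket coefficient and a lower run coefficient are the desirable directions.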