New Methods For Determining Batting Performance In Short Sequences of Games

Cricket Moneyball Two – assessing batting performance over a relatively short period of time.

Over the course of an entire career the conventional statistical methods for determining batting prowess work reasonably well. We can, for instance, determine that with a career Test average (AVE) of 99.94, Don Bradman was a half-decent Test batsman.

The problems arise when we are assessing player performance over a relatively short period of time, when we do not have a large number of innings to sample.  This can for instance become an issue when we are attempting to determine current short-term form, or a player’s performance in a given tournament.

There are a variety of potential problems here, varying game conditions for instance (more of this in a later post), but chief amongst them is the batsman whose high number of not-out scores distorts his (or her) average. A high number of not-outs may be down to the batsman’s innate brilliance, blind luck, or position in the batting order; we cannot tell. Because AVE is calculated by dividing runs scored by times out (AVE = R/W), the result can be an erroneously high figure. The most frequently cited example of this ‘not-out bias’ is Lance Klusener who, in the 1999 World Cup, scored 281 runs in nine innings while being out only twice. This gave Lance an average of 140.5, despite a high score of only 52! Clearly a nonsense.

The first attempt to deal with this problem that I can find comes from ‘the two Alans,’ Alan Kimber and Alan Hansford (1) who attempted to draw on earlier work in survival analysis (Cox & Oakes 1984) and reliability analysis (Crowder, Kimber, Smith & Sweeting 1991) to produce a more rational means of batting performance indication.

I am reliably informed that Kimber & Hansford “argue against the geometric distribution and obtain probabilities for selected ranges of individual scores in test cricket using product-limit estimators…” (1)

No, I have no idea what that means either, so you will be relieved to know that others [Durbach (3) and Lemmer (4)] have since demonstrated that this system is almost as unreliable as AVE. So we can forget them and move on.

At this point our old friend H.H. Lemmer comes to our assistance again. In (4) and (5) he argues that his analysis shows that if a not-out batsman had been allowed to bat on, he could reasonably expect to have scored twice the runs that he actually scored. So logically, if we double the not-out scores and count those innings as wickets we have a more accurate assessment, right? Well, not quite. Nothing is quite that simple in the wonderful world of cricket moneyball.

The formula derived by Lemmer from his insight is

e6 = (sumout + (2.2 - 0.01 x avno) x sumno)/n

where

n denotes number of innings played

sumout denotes the sum of out scores

sumno denotes the sum of not out scores

avno denotes the average of not out scores

However, if you simply double the not-out scores and count each of those innings as an ‘out’, you end up with a very similar figure to e6.

 To put that into Lemmer’s parlance, the formula for this simpler method is

e2 = (sumout + 2 x sumno)/n

as you would expect.

Lemmer himself calls this ‘a good estimator’ and that’s good enough for me. This is the formula that I use for day-in, day-out assessment of batting performance in single-day games.

Coming to a spreadsheet near you.

There is one caveat: where there is a single very large not-out score, the difference between e2 and e6 can become very large (>10), in which case we can use the measure e26, which is found by:

e26 = (e2 + e6)/2
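For anyone who would rather script this than build a spreadsheet, here is a minimal Python sketch of AVE, e2, e6 and e26. It reads the e6 formula as (sumout + (2.2 - 0.01 x avno) x sumno)/n, which is an assumption about the intended bracketing. The function names and the sample innings are made up purely for illustration and are not taken from Lemmer's papers.

```python
# Hypothetical helpers; each innings is a (runs, not_out) pair.

def _sums(innings):
    """Return (sumout, sumno, number of not-outs, n) for a list of innings."""
    sumout = sum(runs for runs, not_out in innings if not not_out)
    sumno = sum(runs for runs, not_out in innings if not_out)
    count_no = sum(1 for _, not_out in innings if not_out)
    return sumout, sumno, count_no, len(innings)


def ave(innings):
    """Conventional average: runs scored divided by times out (R/W)."""
    sumout, sumno, count_no, n = _sums(innings)
    outs = n - count_no
    if outs == 0:
        return None  # undefined when the batsman was never dismissed
    return (sumout + sumno) / outs


def e2(innings):
    """Double every not-out score and treat every innings as completed."""
    sumout, sumno, _, n = _sums(innings)
    return (sumout + 2 * sumno) / n


def e6(innings):
    """As e2, but with the factor 2 replaced by (2.2 - 0.01 x avno)."""
    sumout, sumno, count_no, n = _sums(innings)
    # With no not-outs sumno is 0, so the value of avno does not matter.
    avno = sumno / count_no if count_no else 0.0
    return (sumout + (2.2 - 0.01 * avno) * sumno) / n


def e26(innings):
    """Mean of e2 and e6, for when the two estimates drift far apart."""
    return (e2(innings) + e6(innings)) / 2


# Made-up scores for illustration only (True means not out).
sample = [(45, False), (20, True), (8, False), (33, True), (60, False)]
print(f"AVE = {ave(sample):.2f}")
print(f"e2  = {e2(sample):.2f}")
print(f"e6  = {e6(sample):.2f}")
print(f"e26 = {e26(sample):.2f}")
```

On these invented scores the conventional average comes out well above e2 and e6, which is the not-out bias at work.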

o0o

Bibliography

1) Kimber, A.C. and Hansford, A.R. (1993) A statistical analysis of batting in cricket. Journal of the Royal Statistical Society, Series A 156, pp 443-455

2) Swartz, T.B. et al (2006) Optimal batting orders in one-day cricket. Computers and Operations Research 33, pp 1939-1950

3) Durbach, I. et al (2007) On a common perception of a random sequence in cricket. South African Statistical Journal

4) Lemmer, H.H. (2008) Measures of batting performance in a short series of cricket matches. South African Statistical Journal 42, pp 83-105

5) Lemmer, H.H. (2008) An analysis of players’ performance in the first cricket Twenty/20 World Cup series. South African Journal for Research in Sport, Physical Education and Recreation 30, pp 71-77

Cricket Moneyball 1 – Assessing Bowling

Do you recognize this situation?

You are a decent pace bowler; you open the bowling for your club, or perhaps come on first change. Today your team is playing against quality opposition, proper batsmen who don’t give their wicket away too easily, although rumor has it that they have a fairly long tail. Bowling well and working hard, you eventually pick up three wickets for twenty-three runs off eight overs (8 – 0 – 23 – 3) and the oppo are 103 for 6 off 24 overs.  So you are feeling quite pleased with yourself as you take a well-earned blow down on the fine-leg boundary.

But what’s this?

Sensing that the opposition batting lacks depth, the captain brings himself and his best mate on at the fall of the seventh wicket, and they proceed to clean up the tail-enders, with the skipper skittling 9, 10 and 11 in four overs and finishing up with better figures than you at 4 – 0 – 19 – 3.

“No fair”, you find yourself thinking, as you trudge disconsolately back to the pavilion, watching the team’s coterie of brown-noses slapping the captain on the back.

Well, fear not, change is at hand. Professor Hermanus H. Lemmer of the Department of Statistics at the University of Johannesburg feels your pain.

Suppose we could find a way to attach a weight to each wicket taken, depending on where the batsman was in the batting order? If we could do that, a bowler would get more statistical credit for taking out Hashim Amla than Monty Panesar, and that can only be a good thing (sorry Monty).

Professor Lemmer has done just that by producing a weighting for every position in every form of cricket – multi-day, 50 over and Twenty/20.

Here for example is the scale for 50 over games

Batting position    Weight
1                   1.30
2                   1.35
3                   1.40
4                   1.45
5                   1.38
6                   1.18
7                   0.98
8                   0.79
9                   0.59
10                  0.39
11                  0.19
Total               11.00

Yes, I’m afraid it still adds up to 11, you won’t be able to claim points for non-existent batsmen.

So, for clarification: Lemmer’s statistical analysis has shown that number 7 batsmen score on average 0.98/11ths of all runs scored in 50-over games, while number 11 batsmen (i.e. people like me) score just 0.19/11ths. So this scale gives a bowler over seven times as much credit for taking out a number 3 or 4 batsman as for a rabbit at 11.

So how to put this table to practical use? Lemmer has designed the Combined Bowling Rate (CBR*) as a new measure of bowling performance. CBR* is calculated with the following formula.

CBR* = 3R/(W*+O+W*xR/B)

Where  R = runs conceded

W*= sum of weights of the batsmen out

O = overs bowled

B = balls bowled

So let’s apply the traditional AVE system and Lemmer’s CBR* methodology to our mythical situation and see what happens.

The Average (AVE) for you (Mr. Excellent Bowler) is 23/3 = 7.67

Whereas Mr. Mealy-Mouthed Skipper gets an average of 19/3 = 6.33

Plainly unfair.

Now imagine you had taken out batsmen 1, 3 and 5. Calculating CBR* using the formula above, the outcome for Mr. Excellent is now 4.92. Looks better already, doesn’t it? And the CBR* for the Skipper (who took out 9, 10 and 11, remember) is now 9.35.

As with AVE the lower the figure for CBR* the better, so now justice has been seen to be done. Despite the fact that Skip picked up the same number of wickets in only half the overs bowled, your figures are convincingly better because you took out the quality batsmen.
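The arithmetic for this example can be sketched in a few lines of Python. This is only an illustration: the function name cbr_star is hypothetical, the weights are copied from the 50-over table in this post, and it assumes whole six-ball overs so that B = 6 x O.

```python
# Position weights for 50-over games, copied from the table in this post.
WEIGHTS_50_OVER = {
    1: 1.30, 2: 1.35, 3: 1.40, 4: 1.45, 5: 1.38, 6: 1.18,
    7: 0.98, 8: 0.79, 9: 0.59, 10: 0.39, 11: 0.19,
}


def cbr_star(runs, overs, positions_out, weights=WEIGHTS_50_OVER):
    """CBR* = 3R / (W* + O + W* x R / B), assuming whole six-ball overs."""
    # W* is the sum of the weights of the batting positions dismissed.
    w_star = sum(weights[pos] for pos in positions_out)
    balls = overs * 6  # assumption: no part-overs bowled
    return 3 * runs / (w_star + overs + w_star * runs / balls)


# The two spells from the story: 8-0-23-3 against 4-0-19-3.
you = cbr_star(runs=23, overs=8, positions_out=[1, 3, 5])
skipper = cbr_star(runs=19, overs=4, positions_out=[9, 10, 11])
print(f"You: {you:.2f}  Skipper: {skipper:.2f}")
```

The sketch does not guard against a spell of zero overs, and a part-over would need the balls bowled passed in separately.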

Some bowlers will come out of this very well. When Professor Lemmer applied his CBR* rankings to the first Twenty/20 World Cup, Jimmy Anderson’s ranking improved from 28th to 22nd due to the high number of top-order batsmen he got out, whereas Umar Gul slipped from 2nd to 5th for the opposite reason.

All we need now is someone to design an iPhone app to calculate CBR* and good bowlers everywhere will be happy.

Anyone?

Hello world!

This is my latest cricket blog, focussing exclusively on the more serious side of cricket, cricket information and cricket analysis.  It is quite deliberately serious, nerdy and wonkish.  If you want jokes go to my other blog.  I’m currently writing a series of articles entitled ‘Cricket Moneyball’ about new ways to assess cricket performance. First post should be up by end November 2012.