The other day I accidentally caught a few seconds of one of those tedious phone-in political discussion shows on the radio. The discussion was obviously about the economy, because I heard the caller say:
“Chancellors of the Exchequer are like cricket captains: you are better off with a lucky one than a good one.”
Or words to that effect. This set me thinking: what elements of luck are there in a captain’s cricketing career, and how can a skipper be judged by his or her luck?
The first element of luck in cricket must surely be the coin toss.
I had always presumed that there was a 50-50 chance of a coin landing on ‘Tails’, which is why, in my short career as captain, I adopted the ‘tails never fails’ philosophy, which, as it turned out, failed about half the time. However, recent Canadian research shows that coin tosses are anything but random, and that 50-50 outcomes are a myth.
Matthew Clark and Dr. Brian Westerberg at the University of British Columbia in Vancouver, Canada, asked thirteen medical students to flip a coin 300 times and try to influence the way it landed. Cash prizes were awarded to the students who could make the coin land on ‘Heads’ most often. After just two minutes’ practice, the students could make the coin land on the side they wanted 54% of the time. One of the participants achieved heads a startling 68% of the time.
The simplest method of toss manipulation (and the one most relevant to cricket) is simply to note which side of the coin is uppermost before it is flipped, as this side is 57% more likely to land facing upwards, they found. This is because discs do not spin symmetrically in flight.
But by far the biggest influences on which side the coin lands are the height, the angle of launch and the catch. By practicing to gain consistency, the tosser can have a significant effect on the outcome, up to a 68% success rate.
Other studies have suggested that the Belgian €1 coin is significantly heavier on one side than the other, which in theory would give more heads than tails. However, Clark and Westerberg demonstrated that this effect was no more pronounced than that of the more routine manipulation demonstrated here.
“The findings of my research show, to statistical significance, that it is easy to manipulate the toss of a coin”, said Clark.
This suggests that when Nasser Hussain lost the toss on fourteen consecutive occasions (16384 to 1, if you are interested), rather than being unlucky, he was just not practicing enough.
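For anyone who wants to check the 16384 figure, here is a minimal Python sketch. The function name is my own, and the second calculation rests on a loud assumption: that the opposing captain wins each toss 54% of the time (the students' figure above) and that every toss is independent. Neither assumption comes from the study itself.

```python
from fractions import Fraction


def losing_streak_probability(n_tosses, p_win=Fraction(1, 2)):
    """Probability of losing every one of n_tosses independent tosses."""
    return (1 - p_win) ** n_tosses


# Fair coin: fourteen losses in a row.
fair = losing_streak_probability(14)
print(fair)  # 1/16384

# Assumption for illustration only: our captain's chance of winning each
# toss is nudged up to 54%, the students' figure quoted in the post.
practised = losing_streak_probability(14, Fraction(54, 100))
print(float(fair / practised))  # how many times rarer the streak becomes
```

Under those assumptions the fourteen-toss losing streak becomes roughly three times less likely, which is about all that a little practice would have bought him.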
1. Clark MPA, Westerberg BD. How random is the toss of a coin? St. Paul’s Rotary Hearing Clinic, University of British Columbia, Vancouver, BC.
2. Murray DB, Teare SW. Probability of a tossed coin falling on its edge. Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics. 1993;48:2547–52. [PubMed]
3. Diaconis P, Holmes S, Montgomery R. Dynamical bias in the coin toss. SIAM Rev. 2007;49:211–35.
4. MacKenzie D. Euro coin accused of unfair flipping. New Sci. 2002. Jan 4, [(accessed 2009 Oct. 22)]. Available: www.newscientist.com/article/dn1748-euro-coin-accused-of-unfair-flipping.html.
5. Denny C, Dennis S. Heads, Belgium wins — and wins. The Guardian; [UK]: 2002. Jan 4, [(accessed 2009 Oct. 22)]. Available: www.guardian.co.uk/world/2002/jan/04/euro.eu2.
6. Kosnitzky G. Murphy’s Magic Supplies. Rancho Cordova (CA): 2006. Heads or tails; p. 6.
Twenty20 cricket gives us a whole new set of problems when it comes to assessing batting performance. Upper order batsmen are much more likely to achieve ‘not out’ scores in Twenty20 and we have seen in the previous post how that can render the traditional Average score (Ave) close to useless.
In Twenty20, scoring rate is of critical importance: a batsman who scores 30 runs off 10 balls is worth more to his side than one who scores 31 off 30. So to accurately reflect the quality of a batting performance we need a measure that takes in both runs scored and the rate at which they are scored.
To further complicate matters, runs are scored at very different rates under very different conditions. It is much easier to score runs quickly on a track with true and predictable bounce under a bright blue sky, than on a green strip in Manchester, or a dead track in Nagpur. So if possible we need the measure to take into account conditions on the day.
Croucher (2000) was the first researcher I have found to deal with the first of these problems. He proposed something he called the “Batting Index” (BI), which is found by simply multiplying the conventional average by the strike rate per 100 balls faced.
BI = AVE × SR
AVE = R/out
R = runs scored
out = times out
SR = 100 × R/B
B = balls faced
Basevi & Binoy (2007) used a very similar measure, which they called CALC:
CALC = R^2/(out × B)
Now if you work this out (bear with me here, I only just scraped a low grade ‘A’ level in maths and that was a long time ago), you get
CALC = (R/out) × (R/B)
In other words this is just AVE × SR again, the difference being that this uses runs per ball rather than runs per 100 balls, or to put it another way, CALC = BI/100.
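If your maths is as rusty as mine, a few lines of Python will confirm that CALC really is BI divided by 100. This is only a sketch: the function names are mine, and the batsman (300 runs, out 10 times, 200 balls faced) is made up purely for illustration.

```python
def batting_average(runs, outs):
    """AVE = R / out."""
    return runs / outs


def strike_rate(runs, balls):
    """SR = 100 * R / B, i.e. runs per 100 balls faced."""
    return 100 * runs / balls


def batting_index(runs, outs, balls):
    """Croucher's BI = AVE * SR."""
    return batting_average(runs, outs) * strike_rate(runs, balls)


def calc(runs, outs, balls):
    """Basevi & Binoy's CALC = R^2 / (out * B)."""
    return runs ** 2 / (outs * balls)


# Hypothetical batsman: 300 runs, out 10 times, 200 balls faced.
bi = batting_index(300, 10, 200)
c = calc(300, 10, 200)
print(bi, c)  # 4500.0 45.0
```

The two measures rank batsmen identically; only the scale differs.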
The general feeling among researchers was that this method of simply multiplying average by strike rate over-emphasized the value of strike rate in Twenty/20 games. Secondly, it did not take into account different batting conditions. So how do we account for differing playing conditions when we are assessing a batsman’s performance? One suggestion (Lemmer 2008) is to take the average scoring rate for all batsmen and compare with that figure. For example, if the average strike rate at one particular ground or in one tournament was 124, we could assess an individual player’s performance against that figure. This would then give us a good idea of how that individual was performing. The formula for BP (Batting Performance) is as follows:
BP26 = e26 × RP = e26 × (SR/AVSR)^0.5
e26 = (e2 + e6)/2
e2 = (sumout + 2 × sumno)/n
e6 = (sumout + f6 × sumno)/n
f6 = 2.2 - 0.01 × avno
AVSR = average strike rate
SR = strike rate
In international Twenty20 matches AVSR = 124.03, so that figure can be substituted into the formula.
It is clearly unfair to compare batting performances in differing batting conditions. So by comparing a batsman’s performance with the average strike rate of all batsmen playing in those conditions, a fairer assessment can be made.
Again, Excel is your friend here. It is an initially daunting-looking formula, but once you have it set up in your spreadsheet, the inputs can be added quickly and the result obtained satisfyingly fast.
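If you prefer a script to a spreadsheet, the strike-rate adjustment can be sketched in a few lines of Python. The function names are my own and the batsman with an e26 of 30 is hypothetical. The only figure taken from the post is AVSR = 124.03.

```python
import math

AVSR_T20I = 124.03  # average strike rate in international Twenty20, per the post


def strike_rate_adjustment(sr, avsr=AVSR_T20I):
    """RP = (SR / AVSR) ^ 0.5."""
    return math.sqrt(sr / avsr)


def bp26(e26, sr, avsr=AVSR_T20I):
    """BP26 = e26 * RP."""
    return e26 * strike_rate_adjustment(sr, avsr)


# Hypothetical batsman whose e26 works out at 30.
print(bp26(30, 124.03))  # scoring at exactly the average rate: 30.0
print(bp26(30, 150))     # faster than average, so rated above 30
print(bp26(30, 100))     # slower than average, so rated below 30
```

Because of the square root, strike rate is damped rather than counted in full: a batsman would need four times the average strike rate to double his rating.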
Croucher, J.S. (2000) ‘Player Ratings In One Day Cricket’. Proceedings of the Fifth Australian Conference on Mathematics and Computers in Sport Eds. Cohen G. & Langtry, T. Sydney University of Technology, NSW. 95-106
Basevi, T. & Binoy, G. (2007) ‘The World’s Best Twenty20 Players’ Cricinfo cricinfo.com/columns/content/story/311962.html
Lemmer, H.H. (2008) ‘An Analysis of Players’ Performances in the first Cricket Twenty20 World Cup’. South African Journal For Research In Sport, Physical Education and Recreation 2008, 30(2): 71-77
Lemmer, H.H. (2011) ‘The Single Match Approach to Strike Rate Adjustments in Batting Performance Measures in Cricket’ Journal of Sports Science and Medicine 10, 630-634
Cricket Moneyball Two – assessing batting performance over a relatively short period of time.
During the course of an entire career the conventional statistical methods for determining batting prowess work reasonably well. We can for instance determine that with a career test average (Ave) of 99.94 Don Bradman was a half decent test batsman.
The problems arise when we are assessing player performance over a relatively short period of time, when we do not have a large number of innings to sample. This can for instance become an issue when we are attempting to determine current short-term form, or a player’s performance in a given tournament.
There are a variety of potential problems here, varying game conditions for instance (more of this in a later post), but chief amongst these issues is the batsman who has a high number of not out scores, which can distort his (or her) average. High numbers of not-outs may be down to the batsman’s innate brilliance, blind luck, or their position in the batting order; we cannot tell. This can lead to an erroneously high AVE, which is calculated by dividing runs scored by times out (AVE = R/W). The most frequently cited example of this ‘not-out bias’ is the case of Lance Klusener who, in the 1999 World Cup, scored 281 runs in nine innings while only being out twice. This gave Lance an Average of 140.5, despite having a high score of only 52! Clearly a nonsense.
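The Klusener arithmetic is easy to reproduce. In this small Python sketch the function name is mine, and the runs-per-innings figure is added only as a sanity check; it is not a measure proposed by any of the researchers mentioned here.

```python
def batting_average(runs, times_out):
    """Conventional AVE = R / W (runs divided by times out)."""
    if times_out == 0:
        raise ValueError("the average is undefined for a batsman who was never out")
    return runs / times_out


# Klusener, 1999 World Cup, using the figures quoted above.
runs, innings, times_out = 281, 9, 2

ave = batting_average(runs, times_out)
runs_per_innings = runs / innings

print(ave)                         # 140.5
print(round(runs_per_innings, 1))  # 31.2
```

An average of 140.5 from a batsman making about 31 per visit to the crease shows how far not-outs can stretch the conventional figure.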
The first attempt to deal with this problem that I can find comes from ‘the two Alans,’ Alan Kimber and Alan Hansford (1) who attempted to draw on earlier work in survival analysis (Cox & Oakes 1984) and reliability analysis (Crowder, Kimber, Smith & Sweeting 1991) to produce a more rational means of batting performance indication.
I am reliably informed that Kimber & Hansford “argue against the geometric distribution and obtain probabilities for selected ranges of individual scores in test cricket using product-limit estimators…” (1)
No, I have no idea what that means either, so you will be relieved to know that others [Durbach (3) and Lemmer (4)] have since demonstrated that this system is almost as unreliable as AVE. So we can forget them and move on.
At this point our old friend H.H. Lemmer comes to our assistance again. In (4) and (5) he argues that his analysis shows that if a not-out batsman had been allowed to bat on, he could reasonably expect to score twice the runs that he actually scored. So logically, if we double the not out scores and count those innings as wickets we have a more accurate assessment, right? Well, not quite. Nothing is quite that simple in the wonderful world of cricket moneyball.
The formula derived by Lemmer from his insight is
e6 = (sumout + (2.2 - 0.01 × avno) × sumno)/n
n denotes number of innings played
sumout denotes the sum of out scores
sumno denotes the sum of not out scores
avno denotes the average of not out scores
However, if you were to simply double the not out scores and call that innings an ‘out’ you do end up with a very similar figure to e6.
To put that into Lemmer’s parlance, the formula for this simpler method is
e2 = (sumout + 2 x sumno)/n
as you would expect.
Lemmer himself calls this ‘a good estimator’ and that’s good enough for me. This is the formula that I use for day in, day out assessment of batting performance in single day games.
Coming to a spreadsheet near you.
There is one caveat: where there is one single very large not-out score, the difference between e2 and e6 can become very large (>10), in which case we can use the measure e26, which is found by:
e26 = (e2 + e6)/2
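Until the spreadsheet arrives, here is a minimal Python sketch of the three estimators. The function names, the `threshold` parameter and both sets of scores are my own inventions for illustration. I have also assumed the batsman has played at least one innings, and that e6 reduces to the plain mean when there are no not-outs.

```python
def e2(out_scores, not_out_scores):
    """Simple estimator: every not-out score counts double."""
    n = len(out_scores) + len(not_out_scores)
    return (sum(out_scores) + 2 * sum(not_out_scores)) / n


def e6(out_scores, not_out_scores):
    """Refined estimator: the multiplier f6 shrinks as not-out scores grow."""
    n = len(out_scores) + len(not_out_scores)
    if not not_out_scores:
        return sum(out_scores) / n  # assumption: no not-outs, plain mean
    avno = sum(not_out_scores) / len(not_out_scores)
    f6 = 2.2 - 0.01 * avno
    return (sum(out_scores) + f6 * sum(not_out_scores)) / n


def batting_estimate(out_scores, not_out_scores, threshold=10):
    """Use e2 day to day; fall back to e26 when e2 and e6 disagree badly."""
    a = e2(out_scores, not_out_scores)
    b = e6(out_scores, not_out_scores)
    if abs(a - b) > threshold:
        return (a + b) / 2  # e26
    return a


# Hypothetical batsman with modest not-outs: e2 and e6 agree closely.
steady = batting_estimate([12, 30, 8], [25, 40])

# Hypothetical batsman with one huge not-out: e2 and e6 diverge, so e26 is used.
lopsided = batting_estimate([10, 20], [150])

print(steady, lopsided)
```

In the second case e2 comes out at 110 and e6 at about 45, which is exactly the sort of gap the caveat is there to catch.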
1) Kimber, A.C. and Hansford, A.R. (1993) A Statistical analysis of batting in cricket. Journal of the Royal Statistical Society Series A 156 pp 443-455
2) Tim B. Swartz et al, (2006) Optimal Batting Orders in One day Cricket, Computers and Operations Research 33, 1939-1950
3) Ian Durbach et al (2007) On a Common Perception of a Random Sequence in Cricket South African Statistical Journal
4) Lemmer H.H. (2008) Measures of batting performance in a short series of cricket matches. South African Statistical Journal 42, pp 83-105
5) Lemmer H.H. (2008) An analysis of players’ performance in the first cricket Twenty/20 World Cup series. South African Journal For Research in Sport, Physical Education and Recreation 30 pp71-77
Do you recognize this situation?
You are a decent pace bowler; you open the bowling for your club, or perhaps come on first change. Today your team is playing against quality opposition, proper batsmen who don’t give their wicket away too easily, although rumor has it that they have a fairly long tail. Bowling well and working hard, you eventually pick up three wickets for twenty-three runs off eight overs (8 – 0 – 23 – 3) and the oppo are 103 for 6 off 24 overs. So you are feeling quite pleased with yourself as you take a well-earned blow down on the fine-leg boundary.
But what’s this?
Sensing that the opposition batting lacks depth, the captain brings himself and his best mate on at the fall of the seventh wicket, and they proceed to clean up the tail-enders, with the skipper skittling 9, 10 and 11 in four overs and finishing up with better figures than you: 4 – 0 – 19 – 3.
“No fair”, you find yourself thinking, as you trudge disconsolately back to the pavilion, watching the team’s coterie of brown-noses slapping the captain on the back.
Well, fear not, change is at hand. Professor Hermanus H. Lemmer of the Department of Statistics at the University of Johannesburg feels your pain.
Supposing we could find a way to attach a weight to the wickets taken depending on where the batsman was in the batting order? If we could do that, a bowler would get more statistical credit for taking out Hashim Amla than Monty Panesar, and that can only be a good thing (sorry, Monty).
Professor Lemmer has done just that by producing a weighting for every position in every form of cricket – multi-day, 50 over and Twenty/20.
Here for example is the scale for 50 over games
Yes, I’m afraid it still adds up to 11, you won’t be able to claim points for non-existent batsmen.
So, for clarification, Lemmer’s statistical analysis has shown that number 7 batsmen score on average 0.98/11ths of all runs scored in 50 over games, while number 11 batsmen (i.e. people like me) have scored just 0.19/11ths. So this scale gives a bowler seven times as much credit for taking out a number 3 or 4 batsman as for a rabbit at 11.
So how to put this table to practical use? Lemmer has designed the Combined Bowling Rate (CBR*) as a new measure of bowling performance. CBR* is calculated with the following formula:
CBR* = 3R/(W* + O + W* × R/B)
Where R = runs conceded
W*= sum of weights of the batsmen out
O = overs bowled
B = balls bowled
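The formula translates directly into Python. In this sketch the function name and the bowling figures are hypothetical. The only weights used are the two quoted above for 50 over games (0.98 for a number 7 and 0.19 for a number 11), and the comparison with every wicket weighted at 1 is my own addition, included purely to show the effect of the weighting.

```python
def cbr_star(runs, weights_out, overs, balls):
    """CBR* = 3R / (W* + O + W* * R / B); lower is better.

    weights_out holds one weight per batsman dismissed, so W* is their sum.
    """
    w_star = sum(weights_out)
    return 3 * runs / (w_star + overs + w_star * runs / balls)


# Hypothetical spell: 6 overs (36 balls), 24 runs conceded,
# dismissing the number 7 (weight 0.98) and the number 11 (weight 0.19).
weighted = cbr_star(24, [0.98, 0.19], 6, 36)

# The same spell with both wickets given a weight of 1, for comparison only.
unweighted = cbr_star(24, [1, 1], 6, 36)

print(round(weighted, 2), round(unweighted, 2))
```

The weighted figure comes out higher (that is, worse) than the unweighted one, because two lower-order wickets add up to barely more than one full wicket's worth of credit.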
So let’s apply the traditional AVE system and Lemmer’s CBR* methodology to our mythical situation and see what happens.
The Average (AVE) for you (Mr. Excellent Bowler) is 23/3 = 7.67
Whereas Mr. Mealy-Mouthed Skipper gets an average of 19/3 = 6.33
Now imagine you had taken out batsmen 1, 3 and 5, so calculating CBR* using the formula above, the outcome for Mr. Excellent is now 4.92 – looks better already doesn’t it? And the CBR* for the Skipper, (who took out 9, 10, and 11 remember) is now 9.5.
As with AVE the lower the figure for CBR* the better, so now justice has been seen to be done. Despite the fact that Skip picked up the same number of wickets in only half the overs bowled, your figures are convincingly better because you took out the quality batsmen.
Some bowlers will come out of this very well. When Professor Lemmer applied his CBR* rankings to the first Twenty/20 World Cup, Jimmy Anderson’s ranking improved from 28th to 22nd due to the high number of top-order batsmen he gets out, whereas Umar Gul slipped from 2nd to 5th for the opposite reason.
All we need now is someone to design an iPhone app to calculate CBR* and good bowlers everywhere will be happy.
This is my latest cricket blog, focussing exclusively on the more serious side of cricket, cricket information and cricket analysis. It is quite deliberately serious, nerdy and wonkish. If you want jokes go to my other blog. I’m currently writing a series of articles entitled ‘Cricket Moneyball’ about new ways to assess cricket performance. First post should be up by end November 2012.