Tuesday, 7 December 2010

Statistics - not always black and white

I was a little startled by the front page of the Guardian this morning. It featured an article claiming that David Lammy, MP for Tottenham, had uncovered shocking evidence of racism in the admissions procedures of Oxford and Cambridge - Britain's two most prestigious universities.

Some of the figures are certainly enough to raise an eyebrow - just one black Briton of Caribbean descent accepted by Oxford last year? One college hasn't admitted a black student in five years? Surely this is evidence of institutionalised racism at its worst! Or is it? That one black Briton of Caribbean descent was one of just 35 applicants, and a spokeswoman for Oxford points out that "black students apply disproportionately for the most oversubscribed subjects". This is before you start thinking about how many people don't disclose their ethnicity (on all the forms you're sent when you apply), and so on.

Clearly this is an area where some better statistical thinking would help, but none seems to be forthcoming. There are plenty of points that can be dissected and discussed, but I'm just going to pick on one quote (posted on this blog) from the honourable member, which I think highlights the quality of the analysis:

"Why is it that 25 of 84 Black applicants received offers from Keble College but just 5 of 64 Black applicants received offers from Jesus College over the same 11 year period?"

At face value, this seems quite a big difference. 25 out of 84 is 30%, whilst 5 out of 64 is just 8%. Surely that's not down to chance? That is presumably what we're supposed to think, but let's dig a little deeper. The Guardian have published the admissions data for each college, and it's from these data that the two figures come. A quick look reveals that Lammy has picked out the two colleges with the highest and lowest rates of admission for black applicants, and this is when alarm bells should start ringing.
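The two percentages, and the gap between them, are easy to check. Here is a quick sketch in Python; the variable names are my own, and the only inputs are the four numbers in Lammy's quote:

```python
# Offers and applicants for black applicants, as given in Lammy's quote.
keble_rate = 25 / 84
jesus_rate = 5 / 64

# The gap between two percentages is measured in percentage points.
gap_points = (keble_rate - jesus_rate) * 100

print(f"Keble: {keble_rate:.1%}, Jesus: {jesus_rate:.1%}, "
      f"gap: {gap_points:.1f} percentage points")
# prints: Keble: 29.8%, Jesus: 7.8%, gap: 21.9 percentage points
```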

The sharpshooter fallacy is a classic. An old Texan (not a great shot) fires at the broad side of a barn, and then draws a big target whose centre is where his biggest cluster of shots happened to land. He points to this as proof of his superb marksmanship. The lesson is an essential one in statistics - if you don't decide what you're looking for before you look at your data, it's easy to find results that seem too striking to be down to chance. If Lammy had had some reason to choose Keble and Jesus before he'd looked at the data, then the difference he highlights might mean something. As it is, could it just be down to chance?

Fortunately, it's a pretty easy thing to check. Let's assume that a black applicant has the same chance of being admitted whichever Oxford college they apply to. On average, 22% of black applicants over the last 11 years were admitted, so I'll use this as my baseline. I'm going to simulate new versions of the real dataset, where I take the (real) total number of black applicants to each college, and then see how many get accepted by giving everyone a 22% chance. If we do this, then on average the college with the highest success rate admits 35% of its black applicants, whilst the average lowest success rate is 11% - a spread of 24 percentage points. David Lammy highlights a difference of 22 percentage points between best and worst as indicative of... something, when in fact it's pretty much what you'd expect.
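For anyone who wants to try this at home, the simulation described above can be sketched in a few lines of Python. This is a minimal sketch rather than the exact code behind the figures quoted: the function name and the number of simulated datasets are my own choices, and, more importantly, the applicant counts below are HYPOTHETICAL placeholders. Only the 84 (Keble) and 64 (Jesus) come from the quote; the real per-college totals would need to be copied in from the Guardian's published data, so the output here won't exactly match the numbers above.

```python
import random


def simulate_extremes(applicants, p=0.22, n_sims=1000, seed=0):
    """Average highest and lowest college admission rates under pure chance.

    applicants: number of black applicants to each college
    p: chance of admission, assumed identical at every college
    """
    rng = random.Random(seed)
    total_max = 0.0
    total_min = 0.0
    for _ in range(n_sims):
        rates = []
        for n in applicants:
            # Every applicant is admitted independently with probability p.
            admitted = sum(rng.random() < p for _ in range(n))
            rates.append(admitted / n)
        total_max += max(rates)
        total_min += min(rates)
    return total_max / n_sims, total_min / n_sims


# HYPOTHETICAL applicant counts, one per college. Only 84 (Keble) and
# 64 (Jesus) appear in the quote; the rest are made-up placeholders.
applicants = [84, 64, 70, 55, 90, 60, 75, 50, 80, 65,
              45, 72, 58, 68, 85, 52, 62, 78, 48, 66,
              74, 56, 88, 54, 76, 60, 70, 82, 50, 64]

avg_max, avg_min = simulate_extremes(applicants)
spread_points = (avg_max - avg_min) * 100

print(f"average highest rate: {avg_max:.0%}")
print(f"average lowest rate:  {avg_min:.0%}")
print(f"average spread:       {spread_points:.0f} percentage points")
```

The thing to notice is that even though every simulated applicant has exactly the same 22% chance, the best-looking and worst-looking colleges in each simulated dataset still end up a long way apart.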

There are doubtless plenty of valid issues - some of which Lammy does try to raise - that these data could highlight, had they been analysed properly and not obscured by a cloud of sensationalism. Lammy says that "the variations between colleges in their admissions statistics is a pertinent point and Oxbridge should be doing more to find out why such variations exist". Perhaps if he'd employed a statistician he would have been able to answer this one for himself.
