Nate Silver: hard to believe this man is a statistician
For the second US election in a row, the winner is a guy called Nate Silver, who might be the future of intelligent journalism. He rescues us from the tyranny of columnists who simply write about the contents of their own heads.
Nate blogs at fivethirtyeight.com, which has been a New York Times blog since 2010. He analyses opinion polls, but he does it very, very well. He is entertaining and readable, even if you don’t care who just won the election in the US.
I discovered Nate’s analysis by accident in 2008 when I was looking for some statistics to undermine one of the nuttier blog opinions by data-lite controversialist Melanie Phillips (which made it very nutty indeed). Fivethirtyeight has a rigour that journalism seems to have mislaid in the internet era in a search for sensation. He does a seemingly simple thing extremely well: when an opinion poll is released, he adds it to a model which creates an aggregate. If the model is well-constructed, the aggregate has a smaller margin of error than any single poll and less chance of systematic bias. It is more likely to reflect the true state of the world.
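To see why an aggregate has a smaller margin of error, here is a rough back-of-envelope sketch in Python. It assumes every poll is an independent simple random sample asking the same question, which real polls are not, and it is not Fivethirtyeight's method; the function name and the sample sizes are my own inventions for illustration.

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p
    estimated from a simple random sample of n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

# One poll of 1,000 people versus ten such polls pooled together.
# (Illustrative figures only, assuming a candidate on roughly 50%.)
single = margin_of_error(0.5, 1000)
pooled = margin_of_error(0.5, 10 * 1000)

print(f"one poll of 1,000:  +/- {single * 100:.1f} points")
print(f"ten polls pooled:   +/- {pooled * 100:.1f} points")
```

Under those assumptions the pooled margin is about a third of the single-poll margin, because the error shrinks with the square root of the sample size. Pooling does nothing about errors that all the polls share, which is why looking for systematic bias matters as well.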
The clever part is that he doesn’t just produce an average. He weights the polls, depending on their sample size, the way the information was obtained, the historical accuracy of the polling company, when it was conducted, the exact question that was asked, and so on. He looks for statistical bias – a consistent under- or over-reporting of a candidate’s popularity. He adjusts his own model if he finds evidence that it is biased. Importantly, he writes nerdy blog posts about what he is doing, explaining his reasoning, and pointing out possible flaws in his work.
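The general idea of a weighted aggregate can be sketched in a few lines of Python. This is only my guess at the shape of such a calculation, not Nate's model, whose precise parameters aren't published: the weighting formula, the 14-day half-life, the pollster ratings, the "house effect" corrections and every poll figure below are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass

@dataclass
class Poll:
    pollster: str
    lead: float          # candidate A's lead over B, in percentage points
    sample_size: int
    days_old: int        # days since the poll was conducted
    rating: float        # hypothetical pollster quality score, 0 to 1
    house_effect: float  # hypothetical historical lean towards A, in points

def weight(poll: Poll, half_life_days: float = 14.0) -> float:
    """Bigger, fresher polls from better-rated pollsters count for more."""
    size = poll.sample_size ** 0.5
    recency = 0.5 ** (poll.days_old / half_life_days)
    return size * recency * poll.rating

def aggregate(polls: list[Poll]) -> float:
    """Weighted average lead, after subtracting each pollster's usual lean."""
    total = sum(weight(p) for p in polls)
    return sum(weight(p) * (p.lead - p.house_effect) for p in polls) / total

# Invented polls; the last is a small, old, poorly rated outlier.
polls = [
    Poll("A", 2.0, 1000, 1, 0.9, 0.0),
    Poll("B", 3.0, 800, 3, 0.8, 0.5),
    Poll("C", 1.0, 1200, 7, 0.7, -0.5),
    Poll("D", -6.0, 400, 10, 0.3, 0.0),
]

simple_mean = sum(p.lead for p in polls) / len(polls)
print(f"simple average lead:   {simple_mean:+.1f}")
print(f"weighted average lead: {aggregate(polls):+.1f}")
```

With these made-up numbers the freak poll drags the simple average down to a dead heat, while the weighted figure stays close to what the three larger, more recent polls are saying.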
The result is that “outliers” – polls that, through random sampling, produce a freak result – carry little weight on Fivethirtyeight, while on the internet and the news channels they tend to dominate the agenda, albeit fleetingly. This means his reporting is less shouty, but it has proved to be stunningly accurate for two elections in a row: at the time of writing, his analysis has correctly predicted the result in every state for the 2012 US presidential election, and the electoral college vote too.
Having a model doesn’t necessarily mean you will be correct – there are plenty of other statistical models which predicted the election less accurately. Fivethirtyeight carefully spells out the steps in its analytical process (though not the precise parameters of the model), so we can make an informed judgement on the quality of the findings. Any model is open to criticism from other statisticians, but this means they can have an adult, public conversation about what might be improved, or what the impact of a flaw in the analysis might be. We can learn from this, too.
This wouldn’t be important if it was just a different way to present the same news; but this type of analysis creates fresh insight. By polling day 2012, the model predicted a greater than 90 per cent chance of an Obama victory; and yet organisations like the BBC and the FT were using lazy phrases like “too close to call” and “on a knife edge”. If newspapers are prepared to do this type of analysis routinely, I suggest, it offers huge potential for creating an open, analytical type of serious journalism led by numbers and observed reality, not opinions.
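A small lead and a high probability of winning are not in conflict, and a toy calculation shows why. The sketch below assumes the uncertainty around an aggregated lead is roughly normal; the lead of 2.5 points and the standard error of 1.8 points are numbers I have invented, not figures from the 2012 model, and a real forecast has to work state by state through the electoral college rather than from a single national margin.

```python
from statistics import NormalDist

def win_probability(lead: float, std_error: float) -> float:
    """Chance that the true lead is above zero, assuming the estimate
    is normally distributed around `lead` with spread `std_error`."""
    return 1.0 - NormalDist(mu=lead, sigma=std_error).cdf(0.0)

# Hypothetical aggregate: 2.5 points ahead, give or take 1.8 points.
p = win_probability(lead=2.5, std_error=1.8)
print(f"chance of winning: {p:.0%}")
```

On those assumptions the candidate who is "only" 2.5 points ahead wins more than nine times out of ten. A lead that looks narrow in a headline can still be a strong favourite once the uncertainty has been squeezed down by aggregation.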
[Image caption: Old jokes department: “And what do you do?”]
Not every journalist can be a stats geek, though I think they should have more compulsory education in how to interpret data, and would prefer that newspapers enforced an in-house ban on reporting surveys that are statistical nonsense – which, in my experience, is most of them (I’ve written those survey-based articles in the past, and reported lots of rubbish data as if it were spotless, which I regret).
Newspapers and magazines are cutting back on conventional journalism. Budgets are tight. It’s probably too much to hope that we can create a new type of data-journalist, or that newspapers will suddenly grow a statistical conscience. Yet this kind of work needn’t be expensive: a laptop and some specialist software are perfectly adequate to do the statistical research that can validate the claims that powerful people make. It’s the job of the media to investigate these claims – not just talk to one person who agrees, and another who disagrees. On Radio 4, More or Less does an entertaining job of validating reported statistics (download the podcasts, they are excellent). Ben Goldacre’s Bad Science posts are also a model of this approach.
It’s patronising to assume that readers can’t cope with statistical analysis. Clearly, many don’t like it, and some misunderstand it; but that’s true of any type of journalism that goes beyond the obvious. The conclusions (especially those that go against gut feel or conventional wisdom) may be unpopular: just read the critical comments on Nate Silver’s blog. It’s also true that science isn’t the last word on a subject, just a powerful way of testing an assumption. Statistics involves making value judgements in how you treat the numbers, in the same way as a journalist makes a judgement about how much credibility to give any source. But in statistics there is the opportunity to be explicit about those judgements, and then go where the numbers take us.
This type of insight is a fundamental tool, in an increasingly complex world, if we want to make informed decisions. The alternative is to just place trust in the conclusions of “experts”, of which there seem to be an ever-increasing number quoted on TV or in newspapers.
I’ll leave the conclusion to one of Nate’s commenters, who explains it better than I do:
Rather than cheer for Nate because we all like his Obama forecasts, how about cheering for him because he might believe in a world where numbers and rational analysis are vital to how we make decisions, even in those cases where we don’t like what the numbers imply?… It’s not about hoping you will win at Vegas. It’s about understanding how the Vegas game works.