Nate Silver’s numbers game

Nate Silver: hard to believe this man is a statistician

For the second US election in a row, the winner is a guy called Nate Silver, who might be the future of intelligent journalism. He rescues us from the tyranny of columnists who simply write about the contents of their own heads.

Nate blogs at fivethirtyeight.com, which has been a New York Times blog since 2010. He analyses opinion polls, and he does it very, very well. He is entertaining and readable, even if you don’t care who just won the election in the US.

I discovered Nate’s analysis by accident in 2008 when I was looking for some statistics to undermine one of the nuttier blog opinions by data-lite controversialist Melanie Phillips (which made it very nutty indeed). Fivethirtyeight has a rigour that journalism seems to have mislaid in the internet era in a search for sensation. He does a seemingly simple thing extremely well: when an opinion poll is released, he adds it to a model which creates an aggregate. If the model is well-constructed, this has smaller margins for error and less chance of systematic bias. It is more likely to reflect the true state of the world.
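To see why an aggregate can have a smaller margin of error than any single poll, here is a rough sketch of the textbook calculation. It is not Fivethirtyeight's model: it assumes every poll is a simple random sample, that the polls are independent, and that none of them is systematically biased, which real polls rarely manage. The function name and the sample sizes are mine, chosen for illustration.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a share p measured on a
    simple random sample of n people (textbook formula, an assumption here)."""
    return z * math.sqrt(p * (1 - p) / n)

# One hypothetical poll of 1,000 people, with the race at 50/50.
single = margin_of_error(0.5, 1000)

# Ten such polls pooled together behave like one sample of 10,000,
# provided they are independent and unbiased (a big "if").
pooled = margin_of_error(0.5, 10 * 1000)

print(f"one poll of 1,000: +/- {single:.1%}")
print(f"ten pooled polls:  +/- {pooled:.1%}")
```

The pooled margin is about a third of the single-poll margin, because the error shrinks with the square root of the sample size. What pooling cannot do on its own is remove a bias shared by all the polls, which is why the weighting and adjustment matter.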

The clever part is that he doesn’t just produce an average. He weights the polls, depending on their sample size, the way the information was obtained, the historical accuracy of the polling company, when it was conducted, the exact question that was asked, and so on. He looks for statistical bias – a consistent under- or over-reporting of a candidate’s popularity. He adjusts his own model if he finds evidence that it is biased. Importantly, he writes nerdy blog posts about what he is doing, explaining his reasoning, and pointing out possible flaws in his work.
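A toy version of the weighting idea might look like the sketch below. To be clear, this is my illustration and not Nate's formula: he doesn't publish his precise parameters. The `Poll` fields, the square-root weight on sample size, the 14-day half-life, the 0-to-1 pollster rating and the `house_effect` correction are all hypothetical, and the four polls are made up.

```python
import math
from dataclasses import dataclass

@dataclass
class Poll:
    share: float               # candidate's share, in per cent
    sample_size: int
    days_old: int
    pollster_rating: float     # hypothetical 0-1 score for past accuracy
    house_effect: float = 0.0  # hypothetical points this pollster tends to over-report

def weight(poll, half_life_days=14.0):
    """Bigger, newer polls from better-rated pollsters count for more."""
    recency = 0.5 ** (poll.days_old / half_life_days)
    return math.sqrt(poll.sample_size) * recency * poll.pollster_rating

def weighted_average(polls):
    """Weighted mean of the shares, after subtracting each pollster's house effect."""
    total = sum(weight(p) for p in polls)
    return sum(weight(p) * (p.share - p.house_effect) for p in polls) / total

# Made-up polls: three recent ones in close agreement, plus one old,
# small poll from a weakly rated pollster showing a freak result.
polls = [
    Poll(50.0, 1000, 1, 0.9),
    Poll(51.0, 800, 3, 0.8),
    Poll(49.5, 1200, 2, 0.9),
    Poll(56.0, 400, 20, 0.4),
]

simple = sum(p.share for p in polls) / len(polls)
weighted = weighted_average(polls)
print(f"simple average:   {simple:.2f}")
print(f"weighted average: {weighted:.2f}")
```

The freak poll drags the simple average up by more than a point, but barely moves the weighted one, because it is old, small and comes from a pollster with a poor record.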

The result is that “outliers” – polls that, through random sampling, produce a freak result – have little influence on Fivethirtyeight, whereas on the internet and the news channels they tend to dominate the agenda, albeit fleetingly. This means his reporting is less shouty, but it has proved to be stunningly accurate for two elections in a row: at the time of writing, his analysis has correctly predicted the result in every state for the 2012 US presidential election, and the electoral college vote too.

Having a model doesn’t necessarily mean you will be correct – there are plenty of other statistical models which predicted the election less accurately. Fivethirtyeight carefully spells out the steps in its analytical process (though not the precise parameters of the model), so we can make an informed judgement on the quality of the findings. Any model is open to criticism from other statisticians – but this means they can have an adult, public conversation about what might be improved, or what the impact of a flaw in the analysis might be. We can learn from this, too.

This wouldn’t be important if it was just a different way to present the same news; but this type of analysis creates fresh insight. By polling day 2012, the model predicted a greater than 90 per cent chance of an Obama victory; and yet organisations like the BBC and the FT were using lazy phrases like “too close to call” and “on a knife edge”. If newspapers are prepared to do this type of analysis routinely, I suggest, it offers huge potential for creating an open, analytical type of serious journalism led by numbers and observed reality, not opinions.
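How can a race that looks “too close to call” carry a 90 per cent probability? A small lead measured precisely is a strong signal. The sketch below is my own simplification, not the Fivethirtyeight model: it assumes the true lead follows a normal distribution around the aggregate, and the numbers (a 2-point lead with a standard error of 1.5 points) are invented for illustration.

```python
from statistics import NormalDist

# Hypothetical aggregate: a 2-point lead, known to within a
# standard error of 1.5 points (both figures are made up).
lead = 2.0
standard_error = 1.5

# Probability that the true lead is above zero, assuming the
# uncertainty is normally distributed (an assumption, not a fact).
win_probability = 1 - NormalDist(mu=lead, sigma=standard_error).cdf(0)

print(f"chance the leader is really ahead: {win_probability:.0%}")
```

Under these assumptions a 2-point lead translates into a win probability of about 91 per cent. The headline margin looks like a knife edge; the probability does not.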

Old jokes department: “And what do you do?”

Not every journalist can be a stats geek, though I think they should have more compulsory education in how to interpret data, and would prefer that newspapers enforced an in-house ban on reporting surveys that are statistical nonsense – which, in my experience, is most of them (I’ve written those survey-based articles in the past, and reported lots of rubbish data as if it were spotless, which I regret).

Newspapers and magazines are cutting back on conventional journalism. Budgets are tight. It’s probably too much to hope that we can create a new type of data-journalist, or that newspapers will suddenly grow a statistical conscience. Yet it needn’t be expensive: a laptop and some specialist software are perfectly adequate to do the statistical research that can validate the claims that powerful people make. It’s the job of the media to investigate these claims – not just talk to one person who agrees, and another who disagrees. On Radio 4, More or Less does an entertaining job of validating reported statistics (download the podcasts, they are excellent). Ben Goldacre’s Bad Science posts are also a model of this approach.

It’s patronising to assume that readers can’t cope with statistical analysis. Clearly, many don’t like it, and some misunderstand it; but that’s true of any type of journalism that goes beyond the obvious. The conclusions (especially those that go against gut feel or conventional wisdom) may be unpopular: just read the critical comments on Nate Silver’s blog. It’s also true that science isn’t the last word on a subject, just a powerful way of testing an assumption. Statistics involves making value judgements in how you treat the numbers, in the same way as a journalist makes a judgement about how much credibility to give any source. But in statistics there is the opportunity to be explicit about those judgements, and then go where the numbers take us.

This type of insight is a fundamental tool, in an increasingly complex world, if we want to make informed decisions. The alternative is to just place trust in the conclusions of “experts”, of whom there seem to be an ever-increasing number quoted on TV or in newspapers.

I’ll leave the conclusion to one of Nate’s commenters, who explains it better than I do:

Rather than cheer for Nate because we all like his Obama forecasts, how about cheering for him because he might believe in a world where numbers and rational analysis are vital to how we make decisions, even in those cases where we don’t like what the numbers imply?… It’s not about hoping you will win at Vegas. It’s about understanding how the Vegas game works.


2 Responses to “Nate Silver’s numbers game”


  1. mattberkley, November 12, 2012 at 9:00 am

    “It’s hard to think of a more important figure than the number of people living in poverty.”

    – BBC radio programme More or Less, 3 March 2012.

    “It’s the job of the media to investigate these claims – not just talk to one person who agrees, and another who disagrees. On Radio 4, More or Less does an entertaining job of validating reported statistics (download the podcasts, they are excellent).”

    – Tim Phillips.

    The programme misleads in several important aspects. A complaint was sent:

    http://www.mattberkley.com/bbc.txt


  2. The dangers of data journalism « talk normal: Trackback on November 13, 2012 at 10:23 am
