My previous post on data journalism might have conveyed the impression that I think it will cure all the problems of the press-release-rewriting style of journalism that readers of the Metro, for example, experience. Following several emails, I think I need to clarify.
I praised BBC Radio’s More or Less, but Matt Berkley emailed to criticise the programme’s feature on the World Bank’s global poverty stats, which he thinks “misleads in several important aspects”. Matt’s comment interested me (not least because I have, in another life, done some research on global poverty statistics), so I had another look. Feel free to read his complaint to the BBC and compare it to the published story, or the podcast.
Data doesn’t remove the room for debate, it just shifts the debate on to different territory. A data journalist will still make value judgements – but those should, where possible, be informed by statistical analysis, not an appeal to authority.
Now, attempting to report world poverty in a newspaper article sets the bar extremely high: even the meaning of the word “poverty” is a value judgement.
We can do better than “world poverty is decreasing because the World Bank says it is”, which is a simple appeal to authority: those guys are the experts, so they must be correct.
Given the World Bank report, journalists may ask:
- Why do we pick a certain income level to indicate poverty? Even if we accept that far fewer people now live on $1.25 a day or less, there are almost as many people surviving on $2 or less as there were before. The poverty line may be defined as not starving, or as not having some defined “basic needs” met, or as not being among the poorest 20 per cent in your country. These are all different numbers, and all are used by economists. Note: you can’t eradicate the last type of poverty, in case you were wondering.
- Whether we correct an arbitrary poverty line for the relative price of the things that poor people buy in different countries (and how do we decide what those things are? The poor in different countries eat different food and have different habits, which may make some parts of the world seem richer when the quality of life is no better).
- Do we use a measure of earned income, or of what those people can eat or trade? The urban poor may have a bit more cash than the rural poor, but don’t have domestic animals, for example, so they might spend more but eat less. This is very difficult to measure.
- Most seriously, is the data being used to manipulate the headline? Once you have done the rest of the analysis, this becomes clearer. Governments (or World Banks) are sometimes accused of picking a threshold, or a measurement process, to suit a carefully chosen good-news agenda.
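The sensitivity of a poverty headcount to the chosen line can be sketched in a few lines of Python. The incomes below are invented purely for illustration; the point is only that moving the threshold changes who counts as poor:

```python
# Hypothetical daily incomes in dollars for a tiny made-up population.
incomes = [0.80, 1.10, 1.20, 1.30, 1.90, 2.10, 3.00, 5.00]

def headcount(incomes, line):
    """Number of people living on `line` dollars a day or less."""
    return sum(1 for x in incomes if x <= line)

for line in (1.25, 2.00):
    print(f"poor at ${line:.2f}/day: {headcount(incomes, line)}")
# → poor at $1.25/day: 3
# → poor at $2.00/day: 5
```

The same population yields three poor people or five, depending only on which line you draw, and that is before any of the measurement questions above are settled.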
An example of the final point: the government of Cynicalia wants to claim that it has abolished poverty, with the poverty line defined, as the World Bank defines it, as $1.25 a day. A million working-class Cynicalians earn an average of $1 a day, a million middle-class Cynicalians earn an average of $3 a day, and the president and his family earn $100,000 a day. The government might squeeze the middle so that there are two million people earning $2 a day, while not redistributing the president’s wealth, which is hidden in Switzerland, at all. It can now send out a press release claiming that no one is poor, and that more than half the country is as well off as, or better off than, before the reform.
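A minimal sketch of the arithmetic behind this claim, using the figures from the example above:

```python
# The Cynicalia example in numbers: income (dollars/day) -> number of people.
LINE = 1.25  # the World Bank poverty line

before = {1.00: 1_000_000, 3.00: 1_000_000, 100_000.00: 1}
after = {2.00: 2_000_000, 100_000.00: 1}  # the middle has been "squeezed"

def poor(dist, line=LINE):
    """Headcount of people at or below the poverty line."""
    return sum(n for income, n in dist.items() if income <= line)

print(poor(before))  # → 1000000: a million people below the line
print(poor(after))   # → 0: poverty "abolished", though a million people
                     #   who earned $3 a day are now worse off
```

The press release is arithmetically true and still misleading, which is exactly why a journalist should check the numbers at more than one poverty line.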
A journalist can check the numbers of poor people at different poverty lines (maybe even using different measurements of income), investigate how the poverty line is calculated, or examine the effect of different redistribution policies. The figures exist, though working out how they were calculated can be a headache. All this takes time and some expertise, which is a problem.
Or the newspaper can just give up, and tell the journalist to repeat the government’s claim that Poverty is History. In which case that journalist is a loyal Cynic.
The article that Matt criticises covers many of the assumptions on poverty lines in some detail, and highlights their shortcomings. He feels the BBC should have done better.
I don’t agree with most of Matt’s complaint, for two editorial reasons. The first is that, where assumptions are made, I think they are clearly and accurately spelt out. The second is that this feature does not attempt to support a conclusion, merely to investigate how we calculate it (I also disagree with his analysis for a couple of economic reasons, but this is not the forum to air that discussion).
Data journalism is becoming trendy. I wish I’d written about Nate Silver in 2008, before I looked like a bandwagon jumper. But here’s the point: statistics do not resolve all arguments. A data journalist needs to understand how the data was collected, how it is presented, and whether the conclusions are justified by the data. The journalist also needs to resist overclaiming based on the emotional appeal of what the data seems to say.
I can show you plenty of examples of bad data journalism, where a little understanding can be as bad as none at all: I’ll leave it to you to ask.