I was just passed this article from Nieman Lab, which looks at how the Guardian quickly built a system for crowdsourced analysis of the recently released MP expense claims in the UK.
Imagine you’re a major national newspaper whose crosstown archrival has somehow obtained two million pages of explosive documents that outed your country’s biggest political scandal of the decade. They’ve had a team of professional journalists on the job for a month, slamming out a string of blockbuster stories as they find them in their huge stack of secrets.
How do you catch up?
If you’re the Guardian of London, you wait for the associated public-records dump, shovel it all on your Web site next to a simple feedback interface and enlist more than 20,000 volunteers to help you find the needles in the haystack.
It’s a superb piece of work, both from the Guardian’s perspective and from that of the general public interest. The issue in this case, as far as I can see, is that the Telegraph’s head start allowed them to find the juiciest stories, leaving the Guardian’s readers to dig out minor offences and amusing claims from the documentation. However, it certainly breaks new ground in showing how extensive journalistic tasks can be achieved very quickly, if the publisher is prepared to be open about the process.
Over 20,000 readers have been actively involved in analysing 700,000 documents and highlighting the most meaningful data. It has certainly cut the cost of the research massively, whilst building and strengthening the Guardian community by engaging readers beyond the normal reading/commenting interaction. It appears that only house advertising was shown alongside the crowdsourcing project, so the Guardian probably didn’t generate any direct revenue from the pageviews, but this was probably a decision by the Guardian’s senior editors.
What other projects can be approached in this way? Well, it certainly provides a way of managing the huge amount of data that is being retrieved from public record requests. But as Michael Andersen points out, it really helps if users are pre-motivated to assist, and the public outcry about the Expenses Scandal provided a perfect scenario for such an approach. If more structure can be added to the data-finding process, thereby reducing the burden on individual contributors, then there is no reason why most public record investigations cannot be at least assisted in this way.