Update on the Wikipedia sources project

Last month I presented the first results of the WikiSweeper project, an ethnographic research project to understand how Wikipedia editors track, evaluate and verify sources on rapidly evolving pages of Wikipedia. The results will inform ongoing development of the SwiftRiver (then Sweeper) platform. Wikipedians are some of the most sophisticated managers of online sources, and we were excited to learn how they collaboratively decided which sources to use and which to dismiss in the first days of the 2011 Egyptian Revolution. In the past few months, I’ve interviewed users from the Middle East, Kenya, Mexico and the United States, studied hundreds of ‘talk pages’ from the article, analysed its edits, users and references, and compared these findings to what Wikipedia policy says about sources. In the end, I came up with four key findings that I’m busy refining for the upcoming report:

1. The source (the original version of the article and its author) of the page can play a significant role: Wikipedia policy indicates that characteristics of the book, author and publisher of an article’s citations all affect reliability. But the 2011 Egyptian Revolution article showed how influential the Wikipedia editor who writes the first version of the page can be. Making Wikipedia editors’ reputations, edit histories, etc. more easily readable is critical to understanding points of view while editing and reading rapidly evolving Wikipedia articles.

2. Primary sources are gradually replaced by secondary sources: In the first hours and days of an event, primary sources (on-site journalists reporting from the field) are generally the only sources available, since analysis of the events in context can only happen later. And so primary sources (including raw video footage from YouTube, references to live television footage, etc.) are used despite the common assumption that Wikipedia does not allow primary sources (actually, the policy states that primary sources are allowed but that the majority of sources should be secondary). In the absence of secondary sources, editors must do the work of summarizing and highlighting the most important aspects of the events, something Wikipedians usually leave up to secondary sources.

3. The cite is not always the same as the source: For numerous reasons (including the feeling among some editors that others are biased against more local sources, or the bias against social media sources), the citation that editors use to back up a particular phrase is not always the same as the source from which they received their information. Editors might find their information in one source (for example, Twitter or television) and then find another source, one they perceive to be more acceptable, to cite in the article.

4. The blurring of boundaries along traditional “reliable sources” lines: The evolution of this article shows that blogs (for example those by the BBC and Al Jazeera) can host reliable secondary source authors, that YouTube can host reliably edited footage and that Twitter can provide access to authentic primary sources as a reference for how individuals reacted. Although there are cases of misinformation hosted on social media sources, the “social media” category does not in itself tell us whether a source is reliable. Because Wikipedia policy doesn’t directly address the multiple contexts in which “social media sources” can host reliable information, a number of questions remain open, which is probably why social media source queries dominate the Wikipedia Reliable Sources Noticeboard.

My design recommendations include designing source management systems around the kind of collaboration that is already working on Wikipedia, where editors collaborate around specific news stories, checking whether the source actually reflects the information in the article, whether the source is accurately contextualized, whether other media verify the facts in the article and whether there is any accompanying multimedia.

Below you’ll find my slides. A video of the presentation is available on Vimeo. The full report will be available in a few weeks but I look forward to your responses in the meantime!
