'The ultimate mission is free knowledge'

Interview Arthur Richards
Arthur Richards is a software engineer at the Wikimedia Foundation

How does Wikipedia handle a whopping 700,000 donations from 150 countries?

Linux Format magazine caught up with Arthur Richards of the Wikimedia Foundation to find out.

LXF: A few years ago, when making the DVD for our magazine, we wanted to put some kind of Wikipedia snapshot on the disc – a subset of articles. We found a few random database dumps, but nothing official...

AR: I'm working on that. There's a small suite of tools on a system called Toolserver, which is just a server that some of our community members have access to, for bots and scripts to mine data from Wikipedia.

Someone has written a tool that basically goes through all the Wikipedia articles and parses the assessment data that exists for a lot of them. You won't see that data on every article, only on those that are very carefully watched by a certain project or group.

So you can go into this tool and say: I want to see all the B-or-above rated articles about reptiles, and get back a whole big list. Then you can go through and carefully select specific revisions, and ultimately take those articles and export them to a CSV file, which you can feed through a crazy home-grown system that some guy has that will turn it into an openZIM file. That's a highly compressed data storage format that you can then load into an openZIM reader, and basically have Wikipedia at your fingertips.

You can search through articles, but it's read-only at the moment. I'm actually mentoring a Google Summer of Code student who's taking those tools on the Toolserver, and porting them over to a MediaWiki extension, which will then allow people to build their own collections of articles.

Ideally, once it's actually out there as an extension, other people will be able to pick it up and expand it. We'd like people to be able to build their own custom libraries of Wikipedia articles – specific revisions and the like that they can then share with other people.

You could then take someone else's collection, amend it to make it bigger or smaller, apply certain filters to it – making it child-safe, for instance – and ultimately be able to export those groupings of articles into some kind of offline format. That's the long-term vision.
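The select-and-export step Richards describes (filter by rating and topic, pin a revision, write a CSV) can be sketched roughly as follows. This is only an illustration, not the Toolserver tool itself: the `Assessment` record, the function names, the CSV column layout and the sample rows (including their ratings and revision IDs) are all invented for the example. The only thing taken from Wikipedia is the ordering of its assessment classes.

```python
import csv
import io
from dataclasses import dataclass

# Wikipedia's article assessment classes, best first.
RATING_ORDER = ["FA", "A", "GA", "B", "C", "Start", "Stub"]


@dataclass
class Assessment:
    """Hypothetical record; the real tool's data model is not described."""
    title: str
    topic: str
    rating: str
    revision_id: int  # the specific revision chosen for the collection


def at_or_above(rating, minimum):
    """True if `rating` is at least as good as `minimum`."""
    return RATING_ORDER.index(rating) <= RATING_ORDER.index(minimum)


def select_articles(assessments, topic, minimum):
    """Keep articles on `topic` rated `minimum` or better."""
    return [
        a for a in assessments
        if a.topic == topic and at_or_above(a.rating, minimum)
    ]


def to_csv(selected):
    """Serialise the selection as CSV text (assumed columns: title, revision_id)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["title", "revision_id"])
    for a in selected:
        writer.writerow([a.title, a.revision_id])
    return buf.getvalue()


# Made-up sample data, not real assessment results.
SAMPLE = [
    Assessment("Komodo dragon", "reptiles", "FA", 1001),
    Assessment("Gecko", "reptiles", "B", 1002),
    Assessment("Skink", "reptiles", "Stub", 1003),
    Assessment("Sparrow", "birds", "GA", 1004),
]

if __name__ == "__main__":
    print(to_csv(select_articles(SAMPLE, "reptiles", "B")))
```

Keeping the revision ID next to each title is what would let a collection be shared and rebuilt exactly, which is the point of selecting specific revisions.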

LXF: What area really interests you in the future of Wikimedia?

AR: I'm not sure... I like it all. One of the cool things about working for the Wikimedia Foundation is it's like being a kid in a candy store. There's so much you can do, and so much that needs to be done, and not that many people doing it. You get to explore and touch lots of different aspects.

LXF: I guess that's why you get involved with a project like this in the first place – you know it's not going to be like working on a production line.

AR: Exactly. At the end of the day, the ultimate mission is free knowledge – you can do anything to further that, and that's the passion that drives almost everybody who's involved in it.

--------------------------------------------------------------------------------------------------

First published in Linux Format Issue 153
