How the hell does Wolfram Alpha work?

WolframAlpha is extremely clever, but how does it work?

As it's generated more hype than swine flu, you're probably already familiar with Wolfram Alpha, a powerful new knowledge engine that promises to be the ultimate in online encyclopaedias. (You're not? Take a look at our Wolfram Alpha review.)

What you might not have read about is how Wolfram Alpha works, and exactly what's going on behind the scenes of the rather unassuming web site. And we're just about to tell you.

Massive database

It all starts with the database. Wolfram Alpha has access to many trillions of elements of data covering topics from maths to nutrition, physics to music, weather to anagrams.

Some of this data arrives in real time. Ask for a share price, for instance, and you'll get a value that's no more than a minute old (if the relevant exchange allows it, anyway).

STOCK INFO: Wolfram Alpha has extensive data on stocks and shares going back many years

But most data is input through a more complex, part-automated, part-manual system. The first step comes in choosing sources. There's no general automatic input from the web here: instead, Wolfram Alpha staff work with experts in different domains to decide which sources are the best.

This has produced some impressive results. The company has struck special deals with the owners of proprietary databases that it believes are important, delivering access to information that wasn't previously available online.

The data then goes through an automated procedure to clean and check it. And after that it's verified by real-life experts (some on the Wolfram Alpha staff, some outside) to confirm that it all seems reliable.

This all seems rigorous enough, and you can certainly understand the need to be careful. After all, just one or two stories of inaccuracies in Wolfram Alpha would be enough to undermine its reputation.

But what if you're researching something where there's disagreement, like how dangerous it is to breathe in second-hand smoke, or the number of civilian casualties in Iraq in the past few years? Here the source is everything. Wolfram Alpha will tell you where its data comes from in any response, but if it doesn't use a good range of sources then you may not get the full picture.
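To make the curation process a little more concrete, here's a rough Python sketch of the steps described above: an automated clean-and-check pass, followed by a human sign-off, with every record carrying its source. This is not Wolfram Alpha's actual code. The Record type, the clean and curate functions, the field names and the sample entries are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Record:
    """One curated fact. Hypothetical structure, invented for this sketch."""
    topic: str
    value: float
    source: str  # kept with the fact so any answer can say where it came from


def clean(raw: dict) -> Optional[Record]:
    """Automated step: normalise the fields and reject broken entries."""
    try:
        topic = str(raw["topic"]).strip().lower()
        value = float(raw["value"])
        source = str(raw["source"]).strip()
    except (KeyError, TypeError, ValueError):
        return None  # missing field or a value that isn't a number
    if not topic or not source:
        return None  # a fact with no topic or no source is no use
    return Record(topic, value, source)


def curate(raw_items: Iterable[dict],
           expert_approves: Callable[[Record], bool]) -> List[Record]:
    """Run the automated pass, then keep only what the expert signs off."""
    accepted = []
    for raw in raw_items:
        record = clean(raw)
        if record is not None and expert_approves(record):
            accepted.append(record)
    return accepted


# Made-up sample entries; "Example Handbook" is a placeholder source name.
raw_items = [
    {"topic": " Boiling point of water ", "value": "100", "source": "Example Handbook"},
    {"topic": "melting point of ice", "value": "n/a", "source": "Example Handbook"},
    {"topic": "speed of sound", "value": "343", "source": ""},
]

# The lambda stands in for a human reviewer's judgement.
approved = curate(raw_items, expert_approves=lambda record: record.value >= 0)
for record in approved:
    print(record.topic, record.value, "(source: " + record.source + ")")
```

Only the first entry survives: the second has a value that isn't a number, and the third has no source. Storing the source on every record is what would let a system like this cite its data with each answer.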

Mathematica

A large database is just the start. The real value in Wolfram Alpha comes from how all this information can be organised and related.

Take colours, for example. Wolfram Alpha's creators taught the system various facts about "red", including that it's represented by the code #FF0000 in HTML. But they also added an algorithm explaining that two colours can be combined to produce a third, which is why entering a query like "red + yellow" will display orange as a result.
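Here's a minimal Python sketch of the idea: a couple of stored facts (the HTML codes for red and yellow) plus one small rule for combining them. We don't know how Wolfram Alpha actually blends colours, so averaging each colour channel is our assumption, and the names NAMED_COLOURS and mix are invented for this example.

```python
from typing import Tuple

RGB = Tuple[int, int, int]

# Stored facts: colour names and their standard HTML hex codes.
NAMED_COLOURS = {
    "red": "#FF0000",
    "yellow": "#FFFF00",
}


def hex_to_rgb(code: str) -> RGB:
    """Turn an HTML code such as '#FF0000' into three 0-255 channel values."""
    digits = code.lstrip("#")
    return (int(digits[0:2], 16), int(digits[2:4], 16), int(digits[4:6], 16))


def rgb_to_hex(rgb: RGB) -> str:
    """Turn three 0-255 channel values back into an HTML code."""
    return "#{:02X}{:02X}{:02X}".format(*rgb)


def mix(first: str, second: str) -> str:
    """Blend two named colours by averaging each channel, rounding halves up.

    Channel averaging is an assumption made for this sketch.
    """
    a = hex_to_rgb(NAMED_COLOURS[first])
    b = hex_to_rgb(NAMED_COLOURS[second])
    blended = tuple((x + y + 1) // 2 for x, y in zip(a, b))
    return rgb_to_hex(blended)


print(mix("red", "yellow"))  # prints #FF8000, a shade of orange
```

The point is that the system needs only the two facts and one general rule. Nobody has to store "red + yellow = orange" as a separate entry.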

The process continued with the creation of many more algorithms, each providing a little more real world understanding of how data can be used. And the complete set was then implemented in Mathematica, another Stephen Wolfram project, using five to six million lines of program code.

If you're trying to create models that cover all real world knowledge then five to six million lines really isn't a lot, but in part this is because the Mathematica programming language offers many built-in shortcuts.

DISPLAYING DATA: Mathematica can display items of data, graphs and more, all in just one line of code

Curious about the weather, for instance? A Mathematica program can access it in a single line like WeatherData["Leeds", "Temperature"]. Microsoft share price? FinancialData["MSFT"]. The names of Neptune's moons? AstronomicalData["Neptune", "Satellites"].

And there's much more. Mathematica lies at the heart of Wolfram Alpha, so if you're keen on finding out more about how it works then it's worth taking a look at the language. We'd recommend starting on the official Mathematica Documentation Centre.

Mike Williams