Could big data determine who wins the General Election?

Nothing is being left to chance in the UK's first truly data-driven vote

Political science is, at last, becoming a 'proper' science. Data analysis techniques are spreading throughout the political class as big data is used in this year's General Election like never before.

No longer are those gaggles of people going door-to-door on a hit-and-miss mission – if there are party activists in your area, then you can bet that an analytical engine has told them that your constituency, your area and your street are of critical importance to the outcome of the election. Is this a new approach? Is it too cynical an approach? Only you can decide, but just know that whichever way you vote on May 7, some software has probably already predicted it.

"In previous elections it was the spin doctors and the media that influenced the election outcome," says Mark Morley, Director of Industry, OpenText. "Perhaps in 2015 it will be the data scientists who have the most influence."

Mark Morley, Director of Industry, OpenText

Is this the UK's first truly data-driven vote?

"Data has been used for decades to understand voters' preferences and habits, though this is the first time political parties are using it in earnest to communicate," says Jed Mole, European Marketing Director at Acxiom.

Big data and data analytics have become much more mature since 2010. "Political parties have learnt that it is not just how you gather and archive information that counts, but how you use it to develop an action plan and strategy," notes Morley, who says it's all about taking advantage of digital information to react strategically in near real-time.

Which analytical engines are being used?

Analytical engines are being used by all three main political parties; the Lib Dems use one called Contact Creator, Labour has Voter ID and the Tories use Merlin. They calculate the likelihood of voters choosing each party in every UK constituency, though the raw data behind all of them comes from the Mosaic database of UK demographics.

"Mosaic was one of the first segmentation classifications in the UK," says Mole, who explains that it's stereotyping based on information. On its own it's a really blunt tool. For instance, it will use someone's postcode to determine the type of voter they are likely to be – where you live determines whether you're communicated to as 'Mondeo Man', or not. "Individuals, regardless of whether they are voters, are now far more complex and need to be understood and communicated to on an individual level," says Mole.

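To see why a postcode-only classification is so blunt, consider a minimal sketch in Python. It is not how Mosaic or any party's software actually works; the lookup table, the district codes and the segment labels are all invented for illustration, and the function names are hypothetical.

```python
# Hypothetical lookup from postcode district to a segment label.
# Districts and labels are invented for illustration; they are not Mosaic categories.
SEGMENT_BY_DISTRICT = {
    "ZZ1": "segment_a",
    "ZZ2": "segment_b",
}


def postcode_district(postcode: str) -> str:
    """Return the outward part of a postcode (the text before the space), upper-cased."""
    parts = postcode.strip().upper().split()
    return parts[0] if parts else ""


def segment_for(postcode: str) -> str:
    """Assign a segment from the postcode alone; unknown districts get 'unknown'."""
    return SEGMENT_BY_DISTRICT.get(postcode_district(postcode), "unknown")


# Two different households in the same district receive the same label.
print(segment_for("ZZ1 1AA"), segment_for("zz1 9ZZ"), segment_for("QQ9 1AA"))
```

Every address in a district gets the same label, whatever the people living there actually think, which is the limitation Mole describes.
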
Mosaic is the base layer of information needed to make predictions about the voting population, but as we've become more diverse and unpredictable, a new layer of analytics is being used. It was pioneered, perhaps not surprisingly, in the USA.

Sean Owen, Director of Data Science at Cloudera

Is this the 'Obama-isation' of politics?

"If we use the US 2012 presidential election as the yardage stick for a truly data-driven campaign, then it doesn't seem like we've yet seen the same from a major UK campaign," says Sean Owen, Director of Data Science at big data analytics firm Cloudera. "I believe this will be the first General Election when we can point to a decisive effect from analytics."

In the US 2012 presidential election the Obama campaign used analytics to centralise scraps of information from campaigns, and merge them with demographic databases to understand voters. "Analytics was used to precisely identify 'swing' voters who are receptive to a political message, and intelligently buy media time that precisely reaches them – i.e. many, cheap, focused TV slots, not simply a couple of expensive prime-time slots," says Owen.
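
As a rough illustration of that merge-then-target idea, here is a small Python sketch. It makes no claim about the Obama campaign's actual systems: the voter ids, the `support_score` field, the 0.4 to 0.6 "undecided" band and both function names are assumptions made up for the example.

```python
def merge_records(canvass: dict, demographics: dict) -> dict:
    """Join two sources on a shared voter id, keeping ids present in both."""
    shared_ids = canvass.keys() & demographics.keys()
    return {vid: {**demographics[vid], **canvass[vid]} for vid in shared_ids}


def swing_voters(merged: dict, low: float = 0.4, high: float = 0.6) -> list:
    """Return ids whose hypothetical support score falls in the undecided band."""
    return sorted(
        vid for vid, record in merged.items()
        if low <= record["support_score"] <= high
    )


# Invented records for illustration only.
canvass = {
    "v1": {"support_score": 0.5},
    "v2": {"support_score": 0.9},
    "v3": {"support_score": 0.45},
}
demographics = {
    "v1": {"age_band": "30-39"},
    "v2": {"age_band": "60-69"},
    "v4": {"age_band": "18-29"},
}

merged = merge_records(canvass, demographics)
print(swing_voters(merged))
```

Only voters who appear in both sources can be scored at all, so the quality of the join matters as much as the scoring rule applied afterwards.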

However, the UK hasn't reached that stage yet. "The UK's political parties are nowhere near as advanced in their use of analytics platforms and databases compared to the US," says Mole. "The main UK parties will of course have access to proven and industry-leading analytical software such as SAS, SPSS, but the most important part of any insight-led analytics campaign is the data that goes into it."

"Imagine your analytical software as an engine; the data is the fuel that drives it, and it simply wouldn't run without it.