Twitter mines: at the digital coalface

Twitter has provided an unimaginably rich seam of data to tap into

In 1849, in a desperate attempt to stop local miners rushing to California to join the new gold rush, Dr M.F. Stephenson, head of the local Mint, is famously misquoted as saying "There's gold in them thar hills" as he pointed to the peaks surrounding the town of Dahlonega, Georgia.

In actual fact, he said "There's millions in it", and the real quote is as true today as it was over 150 years ago: if you find the right mine, there are millions, even billions to be made from it.

But in the 21st Century, the mines aren't dug in the ground, and the miners are more likely to have PhDs than a pickaxe and dirty fingernails. Today, we are mining data.

Big data, big mines

You would have to have been living in a literal hole in the ground not to have heard phrases like "big data" and "big analytics" being bandied around. It is a brave new frontier of technology where huge volumes of data are sifted to try to analyse every element of human behaviour, and if possible, to predict how people will react. And if you can predict the future, there is money to be made in it.

In fact, this idea of data mining is not as new as the big data companies would have you believe. For years, retailers have been trying to draw parallels between how different people buy goods, so that they can take what they have learned about one consumer and apply it to another.

How long have you had a supermarket loyalty card? I've had one for almost 20 years. And as Tesco's chairman said after the first trial of the Clubcard loyalty scheme: "What scares me about this is that you know more about my customers after three months than I know after 30 years."

Beer and nappies

There is a similar story told in the US regarding data mining done by Wal-Mart using Teradata back in the early 90s. The system spotted a trend that between 5pm and 7pm on a Friday night, people were more likely to buy beer and "diapers" in one transaction. The story is often retold suggesting that the system was able to spot that these purchases were made by young men on their way home from work, which is unlikely, as the system only had access to sales transactions, not the demographics of the purchaser. However, using that trend and moving the two products closer together produced more sales: or at least that's how the story goes.
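For the curious, the kind of sifting described in that story can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the receipts are invented, the function name `pair_lift` is hypothetical, and "lift" is a standard textbook measure of how much more often two items are bought together than chance would suggest, not necessarily what Wal-Mart or Teradata actually ran.

```python
from collections import Counter
from itertools import combinations

# Invented till receipts: each set holds the items in one transaction.
transactions = [
    {"beer", "nappies", "crisps"},
    {"beer", "nappies"},
    {"milk", "bread"},
    {"beer", "crisps"},
    {"nappies", "milk"},
    {"beer", "nappies", "milk"},
]


def pair_lift(transactions):
    """Return {(item_a, item_b): lift} for every pair bought together at least once."""
    n = len(transactions)
    # How many baskets contain each single item.
    item_counts = Counter(item for basket in transactions for item in basket)
    # How many baskets contain each pair (pairs stored in alphabetical order).
    pair_counts = Counter(
        pair
        for basket in transactions
        for pair in combinations(sorted(basket), 2)
    )
    lifts = {}
    for (a, b), together in pair_counts.items():
        # lift = P(a and b) / (P(a) * P(b)); above 1 means the pair turns up
        # together more often than it would if the purchases were unrelated.
        lifts[(a, b)] = (together / n) / ((item_counts[a] / n) * (item_counts[b] / n))
    return lifts


lifts = pair_lift(transactions)
for pair, lift in sorted(lifts.items(), key=lambda kv: kv[1], reverse=True):
    print(pair, round(lift, 3))
```

Note that the sketch sees only what is in each basket, which is the article's point: with sales transactions alone, the system can say which products travel together, but nothing about who is buying them.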

The point is, though, that sifting data for that elusive nugget of gold is not new. Two things, however, are new. First, the volume of data has increased massively. Second, and in a sense the more exciting point, much of it is now publicly available.