Why big data is crude oil – while rich data is refined, and the ultimate in BI

It's talked up as being an epoch-defining change, but big data on its own is useless. Created by combining open, freely available data with data owned by both citizens and businesses, 'rich data' is a far more valuable commodity.

What's the difference between big data and rich data?

"It's like the difference between crude and refined oil," says Dr. Rado Kotorov, Chief Innovation Officer at Information Builders. "Combining data provides new context and new use cases for the data. For example, combining social media data with transactional data can provide insight into purchases and thus lead to product innovation."

Rich data is created by combining data from different systems. Rich data has context, and thus, is useful in practical terms to both businesses and individuals. "Credit card processing companies sell benchmarking data to merchants," says Kotorov. "These merchants can see general market trends and compare those with their own observations of the market to make better decisions, or understand gaps in their own operations."

Context is everything – a recent study in Harvard Business Review shows that location-based offers to shoppers increase the odds of purchasing by 76%. Rich data could help healthcare providers and even fight crime, too.

Why do we need rich data?

Rich data is nothing short of cutting-edge business intelligence. "Rich data can be used to answer different kinds of questions that would previously have been difficult," says Southard Jones at cloud business intelligence and analytics company Birst. "Linking up multiple sources of information can help see things in new ways or across the whole process, rather than just one team's responsibility."

For example, imagine a sales team analysing which products to sell to which customers. Instead of looking at sales data in isolation to see who bought what last year, a rich data approach is to look beyond sales data and see the effect of marketing campaigns, and finance (how quickly the customer pays), too.
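To make the sales example concrete, here is a minimal Python sketch of what "looking beyond sales data" could mean in practice. Everything in it is hypothetical: the customer IDs, the field names and the figures are invented for illustration, and a real business would pull these records from its own sales, marketing and finance systems rather than from hard-coded dictionaries.

```python
# Hypothetical records from three separate systems, keyed by customer ID.
# All IDs, field names and values are invented for illustration only.
sales = {
    "C001": {"product": "Widget", "revenue": 12000},
    "C002": {"product": "Gadget", "revenue": 8000},
}
marketing = {
    "C001": {"last_campaign": "spring_email", "responded": True},
    "C002": {"last_campaign": "spring_email", "responded": False},
}
finance = {
    "C001": {"avg_days_to_pay": 28},
    "C002": {"avg_days_to_pay": 75},
}


def enrich(customer_id):
    """Merge the per-system records for one customer into a single view."""
    combined = {"customer_id": customer_id}
    for source in (sales, marketing, finance):
        # A customer missing from one system simply contributes no fields.
        combined.update(source.get(customer_id, {}))
    return combined


# One combined row per customer known to the sales system.
rich_data = [enrich(cid) for cid in sorted(sales)]

for row in rich_data:
    print(row)
```

Each combined row now shows what a customer bought alongside whether they responded to a campaign and how quickly they pay, which is the kind of cross-team context the sales figures alone cannot give.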

Rich data is about predicting behaviour. Selling the right product to the right person at the right price is what sales is all about, but none of this presently relies on data. "Often sales rely on gut feel and experience," says Jones. "Replacing that with a system where a customer's propensity to buy is clearly indicated allows sales to prioritise their efforts and improve productivity and accuracy."

What's wrong with big data?

It's far too shallow to use. "The computing power of the cloud has enabled us to collect, store and process vast levels of data, but with big data it is inevitable that we will also collect lots of duplications, deviations and duds," says Nigel Beighton, VP of Technology, Rackspace, who says that without big data, we can't have rich data. "Rich data is the diamond in the rough."

Jon Cano-Lopez, CEO of REaD Group

"We often say to our clients that while their own customer data is highly accurate because it reflects their actual transactions, it can be pretty limited in terms of the overall depth it provides," says Jon Cano-Lopez, CEO of REaD Group. "A telecoms company will know how people use their phones and data, their geographic location, and even who their friends are via their telephone numbers," he adds. "However, they don't know who the people really are, what job they do, what their interests are, how much they earn, their family make-up, and what makes them tick."

Unstructured data might tell you about two seemingly identical heavy users, but look closer and one person could be a high-volume business user, while the other is a socialite with many friends. An additional, 'rich' layer of data adds depth, helping to identify the true value of a customer. "Combining transactional data that is based on a customer's activity, with their lifestyle information, provides a much fuller picture," says Cano-Lopez.

What benefits can rich data bring?

It's possible to create huge potential benefit by enriching data and using it in real time. "Medical device data and unstructured data in the form of clinical staff notes are being mined to support earlier diagnosis of conditions like sepsis (blood poisoning)," says Matt Pfeil, Chief Customer Officer at DataStax. "This involves real-time comparison of patient data against a centralised, anonymous set of data."

Creating rich data allows life-threatening conditions to be spotted and treated early, though only with the patient's consent. Their anonymised data can then be used in the future to help treat others.

Jamie Carter
