Are we all doing data wrong?

Confidential, sensitive and unstructured data is lying dangerously dormant

Is big data getting too big? A whopping 90% of all existing data in the world was produced in just the last two years, and only 20% of it is being used. Big data analytics is unable to keep pace with this exponential rise, and there’s one very obvious result: most of the data being collected by businesses, individuals and Internet of Things sensors is not being used.

Unstructured, unused and unloved data lurking on the computers, servers and archives of organisations across the globe is clear evidence that businesses, while becoming increasingly digitised and data-centric, are still living in the dark ages.

Dark data can be metadata produced by other systems (Image Credit: NASA)

What is dark data? 

Unused or 'dark' data is the story of the business world failing to live up to expectations on a massive scale. Dark data is defined by Gartner as ‘the information assets that organisations collect, process and store during regular business activities, but generally fail to use for other purposes’.

“Primarily we are talking about transactional information, log files, metadata which has not been used, small bits of unanalysed information which appear to have no value and may well be seen as the waste product of other systems and processes,” says John Culkin, Director of Information Management at Crown Records Management, who advises firms on data policies. To that list he adds ZIP files and draft, temporary and old emails.

Dark data can lurk on unused company laptops

Where does dark data come from? 

“Dark data makes up around 80% of total content in any organisation,” says Stephen Mackey, Senior IM Consultant at information management firm Kefron, who insists that it’s the result of standard day-to-day business processes. “Dark data is all the content that is left behind, hidden in systems and servers, and underused or forgotten about,” he adds.

According to IDC, 90% of unstructured data is never analysed, which is often the result of a dangerous anti-delete attitude, fuelled by both compliance regulation and the availability of cheap data storage in the cloud and elsewhere.

“For a retail or manufacturing company, for instance, financial information may be rightly kept as a record,” says Culkin, adding, “but although data generated by many sales and delivery systems is not required, it is rarely deleted.” Paradoxically, this cautious keep-everything attitude towards data creates risk of its own.

There could be 21 billion IoT devices by 2020

Why is dark data damaging? 

There are two main ways dark data can damage a business. Firstly, there’s a security risk in not deleting data. “It’s important that files are not forgotten about,” says Mackey. “If they are not monitored and kept safe, the business-critical information they contain could be mined without knowledge and used for nefarious reasons.”

Data that isn’t going to be used should either be deleted or protected from unauthorised access, because confidential, sensitive and unstructured information could include customer account details, and leaving those exposed raises compliance issues.
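As a rough illustration of how an organisation might start finding candidates for deletion or protection, the sketch below walks a folder and lists files of a few typical dark data types that have not been modified for a long time. It is a minimal sketch, not a method described by anyone quoted here: the function name find_stale_files, the two-year threshold and the list of file extensions are all assumptions chosen for illustration.

```python
import time
from pathlib import Path

# Hypothetical settings: files untouched for about two years, limited to a
# few file types often cited as dark data (logs, archives, temp files).
STALE_AFTER_DAYS = 730
CANDIDATE_SUFFIXES = {".log", ".zip", ".tmp"}


def find_stale_files(root, stale_after_days=STALE_AFTER_DAYS, now=None):
    """Return candidate files under `root` not modified within the threshold."""
    now = time.time() if now is None else now
    cutoff = now - stale_after_days * 86400  # threshold in seconds
    stale = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        if path.suffix.lower() not in CANDIDATE_SUFFIXES:
            continue
        # st_mtime is the last modification time as a Unix timestamp
        if path.stat().st_mtime < cutoff:
            stale.append(path)
    return sorted(stale)
```

Calling find_stale_files on a shared drive or archive folder would give a list to review by hand. The sketch only reports files; whether each one is deleted, archived or locked down remains a policy decision.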

The second way dark data can harm a business is by indirectly costing it money. “Many businesses are unaware of what kind of data even exists, and it’s this hidden data that hampers internal reviews and external audits,” says Mackey. “What if an issue is raised about an account from two years ago, and payment is called into question, but the invoices and records cannot be found?” he asks. The answer is simple: dark data costs businesses money.