Useless data and carbon waste: the dark side of digitization


As the world works towards vital net zero targets, digitization has become essential in delivering efficient, green strategies. Data is critical in driving better business outcomes and ensuring a sustainable future. On the flip side, however, our new data-driven world throws up its own sustainability challenges. The data centers that house our digital reserves require vast amounts of energy. Global emissions from cloud computing, for example, are predicted to amount to over 3.5% of greenhouse gas emissions, even more than commercial flights.

Over the last decade, efforts have been made to make data centers more sustainable. However, while the infrastructure can be made greener, wasted storage undermines those efforts. The continued storage of useless data drains precious resources. According to Veritas research, the power it takes to store such dark data wastes up to 6.4m tons of carbon dioxide yearly. Analysts predict that by 2025 there will be around 91 ZB of dark data being held unnecessarily, over four times today’s volume.

Dark data

On average, our research found that 52% of data stored by organizations is ‘dark’: its content and value are unknown, and it is essentially useless until its value (if any) is determined. At the same time, it is estimated that around one third of organizational data is Redundant, Obsolete or Trivial (ROT).

In short, swathes of data are being stored for no reason. ROT data is a major contributor to high storage costs: recent global research suggests over nine in ten organizations exceed their cloud budgets, overspending by an average of 43%, mainly on storage, backup and recovery. Much has been said about the financial cost of dark data, but the environmental cost is too often overlooked. Deleting massive sprawls of data waste could drastically reduce organizations' carbon footprints, leading to greater sustainability and lower costs. As such, businesses must get on top of their data management strategies, use the right tools to identify valuable data and rid their data centers of unnecessary, energy-draining ‘dark data’.

Ian Wood

Ian Wood is the Senior Director and Head of Technology at Veritas.

Data management is crucial

Data management is a crucial first step for organizations to effectively analyze data at scale.

This starts with data mapping and discovery, understanding how information flows through an organization. Gaining visibility and insight into where data and sensitive information are stored, who has access and how long it is being retained is the first port of call when identifying dark data. However, it’s important that organizations invest in an ongoing program of proactive data management. This allows organizations to gain visibility into their data, storage, and backup infrastructure and make insight-led decisions related to data deletion on an ongoing basis. Accumulated dark and ROT data is a drain on all resources.

Additionally, data minimization and purpose limitation can reduce the amount of data being stored and ensure what is retained is directly related to its purpose. Using classification, flexible data retention policies, and compliant policy engines means that there can be confident deletion of non-relevant information. This not only reduces the amount of dark data feeding off data center resources but can also ensure compliance with data protection regulations such as GDPR.
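As a rough illustration of how classification and retention policies can combine into a deletion decision, here is a minimal Python sketch. The category labels, retention periods, and the `Record` and `decide` names are all hypothetical and chosen for illustration only; real retention periods would come from an organization's legal and compliance requirements, not from this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict


@dataclass(frozen=True)
class Record:
    path: str
    category: str   # hypothetical classification label assigned upstream
    created: date   # date the record was created


# Hypothetical retention periods in days, keyed by category.
RETENTION_DAYS: Dict[str, int] = {
    "invoice": 7 * 365,
    "log": 90,
    "temp": 30,
}


def decide(record: Record, today: date) -> str:
    """Return 'keep', 'delete', or 'review' for a single record."""
    limit = RETENTION_DAYS.get(record.category)
    if limit is None:
        # Unclassified ("dark") data: its value is unknown, so it is
        # routed for classification rather than deleted blindly.
        return "review"
    age_days = (today - record.created).days
    return "delete" if age_days > limit else "keep"
```

In this sketch nothing is deleted unless it has first been classified, which mirrors the point that confident deletion depends on knowing what the data is.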

For many organizations, reducing dark and ROT data is not a simple task, especially when handled manually. The process can be complex, with many enterprise data management solutions retaining a manual deployment and maintenance approach, slowing operational agility.

With the amount of data being created and stored exploding, this is not a task enterprises can afford to do manually. Automating analytics, tracking, and reporting of dark data is essential when handling potentially petabytes of data and billions of files. Additionally, the need for multi-cloud strategies has necessitated the development of a new approach to data management.

Autonomous data management

The ultimate tool for organizations is now autonomous data management. Here, artificial intelligence (AI) and machine learning (ML) technologies allow the automation of data management processes and minimize human intervention and oversight. By automating the provisioning, optimizing, healing, and configuring of data management technologies across multi-cloud environments, businesses can gain a much clearer, more accurate picture of their data in a much shorter space of time – no matter what it is or where it is stored.

For example, enterprise data management platforms can now autonomously classify cloud-based data, deduplicate unnecessary, redundant data in the cloud, and archive or delete obsolete and trivial cloud data. Such an automated data insight approach should also be integrated with archiving, backup and cybersecurity solutions to prevent data loss and ensure policy-based data retention.
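One simple way to picture the deduplication step is content hashing: objects with identical bytes produce the same digest, so they can be grouped and the extra copies flagged. The sketch below is an assumption-laden illustration, not how any particular platform works; the `find_duplicates` helper and the in-memory `objects` mapping are hypothetical stand-ins for a real storage inventory.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def content_digest(data: bytes) -> str:
    """SHA-256 digest of an object's bytes, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def find_duplicates(objects: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Group object names by content digest.

    Returns only the groups that contain more than one name,
    i.e. the sets of redundant copies.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in sorted(objects):  # sorted for a deterministic order
        groups[content_digest(objects[name])].append(name)
    return {digest: names for digest, names in groups.items() if len(names) > 1}


# Hypothetical inventory: two identical reports and one unique file.
inventory = {
    "reports/q1.csv": b"region,total\nemea,10\n",
    "backup/q1_copy.csv": b"region,total\nemea,10\n",
    "reports/q2.csv": b"region,total\nemea,12\n",
}
duplicates = find_duplicates(inventory)
```

At real scale the bytes would be streamed from storage rather than held in memory, and any deletion of the flagged copies would still be subject to the retention policies described earlier.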

Ongoing digital transformation makes organizations' requirements for data content and context a priority, not least when so many of those transformative projects seek to deliver greater sustainability. The energy used to store useless data is pure waste. Imagine if we could automatically remove 85% of this useless data from data centers - this would enable a huge leap towards net-zero.

Reducing the environmental impact of our data storage footprint will be imperative if we are to avoid creating an even larger mass of waste data as the cloud evolves. Green strategies powered by digitalization cannot be let down by the shadow of dark data sapping energy in the background, silently undoing good work. The journey to a sustainable cloud is reliant on tackling data waste. The best solution for managing data waste in a complex, hybrid, and multi-cloud environment is autonomous operation, minimizing reliance on manual processes by combining hyper-automation with data-driven intelligence.


Ian Wood

Ian Wood is the Senior Director and Head of Technology at Veritas, a global leader in data management. He has over 29 years of working experience and is passionate about technology.