Solving the enterprise data optimization problem

Speaking in March this year, NVIDIA CEO Jensen Huang said, “about 90% of [the data] generated every single year is unstructured data. Until now, this data has been completely useless to the world.”

Taken at face value, that should be a major concern. The 90% figure is widely quoted as representing the enormous volume of emails, images, videos, audio files and anything else that organizations create and retain but cannot easily categorize or analyze using traditional systems.

Steve Leeper

VP of Product Marketing, Datadobi.

It stands to reason that if organizations can’t make use of almost all the data they generate, they are operating with a significant gap between the information they hold and the value they derive from it.

So, how have we arrived here? The “uselessness” of unstructured data stems partly from the fact that it isn’t well served by traditional management and analysis tools. As a result, many organizations still struggle to answer basic questions, such as: what data do we have, where is it, how is it used, and does it have any value?

Kicking the can down the road

It’s a set of challenges just about every business would like to address. Data visibility is a good example. How well can the typical business distinguish between active and inactive data, or identify what is most valuable, redundant or no longer needed?

The answer is that many organizations can’t, and to get around the problem, they opt to retain just about everything. When their storage infrastructure starts to fill up, they simply add more. The problem with kicking the data strategy can down the road is that it is hugely expensive and leaves the underlying control and management issues unresolved.

Along with saving what’s important, organizations almost inevitably retain significant amounts of inactive or valueless data. Adding to the cost, this data is often hosted on high-performance, high-cost business storage, an expense that can seriously escalate over time.

Unfortunately, the issues don’t necessarily end there. Without clear ownership and data handling processes, what might already be complex governance and compliance challenges are needlessly exacerbated.

This is problematic at any time, but is particularly painful now, given the extreme volatility of storage pricing. Something has to give, and many businesses need a strategic shift away from keeping data at almost any cost and towards a much greater focus on efficiency and effective management. The bottom line is that unstructured data sprawl has no upsides.

Turning chaos into order

As data volumes grow, these issues take on greater significance, particularly because the value of data rarely remains fixed over time. In many settings, the window for active data use is only 30–90 days, after which its relevance begins to decline as new information is generated. Many businesses also find that over 60% of their stored data has not been accessed or modified for many years, yet is retained due to a lack of visibility or clear policy direction.

This reinforces the need for lifecycle-driven management, in which data is continuously assessed and moved, archived, or removed based on defined criteria rather than being retained indefinitely by default. The first step towards regaining control over data management and IT infrastructure spend is for businesses to develop a much better understanding of the data that exists across their environment. They also need insight into the associated (and detailed) metadata, including age, activity, ownership, and other key tags.
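As a rough illustration of what “defined criteria” could look like in practice, here is a minimal Python sketch that labels files by how long they have been idle and totals the bytes in each category. Everything in it is an assumption made for illustration: the `FileRecord` fields, the `classify` and `summarize` helpers, and the thresholds (the 90-day active window echoes the range mentioned above, while the three-year archive cutoff is arbitrary). It only labels data; any move, archive or delete action would still need to go through an organization’s own governance process.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Iterable


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata record for one file or object."""
    path: str
    owner: str
    size_bytes: int
    last_accessed: datetime
    last_modified: datetime


# Illustrative thresholds only; real policies would be set per organization.
ACTIVE_WINDOW = timedelta(days=90)
ARCHIVE_AFTER = timedelta(days=3 * 365)


def classify(record: FileRecord, now: datetime) -> str:
    """Label a record by the time since it was last accessed or modified."""
    last_activity = max(record.last_accessed, record.last_modified)
    idle = now - last_activity
    if idle <= ACTIVE_WINDOW:
        return "active"
    if idle <= ARCHIVE_AFTER:
        return "inactive"
    return "archive_candidate"


def summarize(records: Iterable[FileRecord], now: datetime) -> Dict[str, int]:
    """Total the bytes that fall into each lifecycle category."""
    totals = {"active": 0, "inactive": 0, "archive_candidate": 0}
    for record in records:
        totals[classify(record, now)] += record.size_bytes
    return totals
```

Run against a metadata inventory, a summary like this gives a first view of how much capacity is tied up in data that nobody has touched recently, which is the kind of visibility the retention decisions depend on.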

It then becomes possible to separate high-value data assets from those that are no longer relevant or required. This is the basis for managing data more confidently and effectively across its entire lifecycle and empowers IT teams to make informed decisions about retention, archiving and deletion.

Throughout this process, consistency of approach and good governance are key. In modern environments, where data is increasingly distributed across on-premises and cloud systems, a fragmented approach can quickly break down. Good governance helps keep everything on track by defining key milestones and accountability for data assets, while consistent management policies ensure that datasets are handled in line with both operational requirements and regulatory obligations.

With so many businesses now deeply data-dependent, taking back control over high-value data and having the confidence to archive and delete what’s no longer required should be an operational and financial priority. Get it right, and there’s the very real prospect of a win-win where enterprise data is not just better managed but more likely to support bottom-line success.

This article was produced as part of TechRadar Pro Perspectives, our channel to feature the best and brightest minds in the technology industry today.

The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/pro/perspectives-how-to-submit
