Storing big data: why compliance is key

Storing all of this data is a daunting concept. If data volumes keep doubling each year, reaching 40 zettabytes by 2020 in the enterprise alone, something different has to be done. Throwing more storage and hardware at the problem does not solve it.

What is really required is an efficient way to keep this data for the time frames required, and data compression technologies have matured, enabling you to reduce overall storage footprints by a factor of 20 to 40.

That translates to keeping data 20 to 40 times longer in the same footprint. With this type of compression, the data does not need to be re-inflated before it is queried, and performance speeds up. With extreme data growth rates, it is the smartest thing to do.
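To make the arithmetic behind that claim concrete, here is a minimal sketch. It assumes a fixed storage budget and a constant yearly ingest rate, which is a simplification not stated in the interview. The function name and the capacity and ingest figures are hypothetical, chosen only for illustration.

```python
def retention_years(capacity_tb: float,
                    ingest_tb_per_year: float,
                    compression_ratio: float = 1.0) -> float:
    """Years of data that fit in a fixed storage budget.

    Assumption (not from the interview): raw data arrives at a constant
    rate, so retention scales linearly with the compression ratio.
    """
    if ingest_tb_per_year <= 0 or compression_ratio <= 0:
        raise ValueError("ingest rate and compression ratio must be positive")
    # Each stored terabyte holds `compression_ratio` terabytes of raw data.
    effective_capacity_tb = capacity_tb * compression_ratio
    return effective_capacity_tb / ingest_tb_per_year


# Hypothetical figures: 100 TB of storage, 50 TB of new raw data per year.
for ratio in (1, 20, 40):
    years = retention_years(100, 50, ratio)
    print(f"compression {ratio}x: about {years:g} years of retention")
```

Under these assumptions, the same hardware that holds two years of uncompressed data holds 40 to 80 years of compressed data. If ingest itself grows every year, the gain in retention time is smaller than the compression ratio suggests.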

TRP: What other challenges, issues and trends may be involved in storing data in the future?

DM: Interestingly, big data, Hadoop and analytics have become almost synonymous, which is actually misleading not only for vendors but, most importantly, for buyers.

Organisations need to figure out the business goals and value they are trying to achieve, which will be unique to each organisation.

Some data needs to be stored long-term because business users deem it valuable, and some has to be kept to meet various compliance mandates.

Figuring that out is half the battle. In addition to the basic storage question and overall costs, there are a host of other requirements spanning data latency, performance SLAs, and overall security and availability.

It is important to specify exact requirements at the outset so that the technology solutions selected check all those boxes.

Hadoop is a great platform for low-cost scale and managing multi-structured data sets, but it can also be misleading, and you can very quickly run into operational costs far exceeding what you expected at the outset.

Managing petabytes is a very different data centre challenge. Also, training staff in new skills is a costly and time-consuming undertaking. You need to leverage what you have today and use standard tools, with SQL central among them, to manage big data now and in the future.

TRP: Alternatively, how can organisations benefit from these trends? Are many taking advantage?

DM: According to Gartner research analyst Merv Adrian, only 8% of enterprises surveyed are in production on Hadoop today, with a further 50% going live in the next two years.

There are many business benefits to be gained by collecting and analysing new data sets, but it's also important to show incremental value.

Gone are the days when enterprises had the patience to commit multi-million dollar investments and then wait two years to reach time-to-value. Rolling out Hadoop clusters for a very specific use case is the ideal and recommended approach.

TRP: What's next in big data? Are there any exciting developments we can expect in the near future?

DM: Big data is not going away or abating. We will continue to see more innovation in this area of data management. Hadoop adoption will grow, but it will require more maturity to meet the high expectations of mainstream enterprises.

Tolerance for gaps in security, data governance and high availability will start to dissipate, and these capabilities will need to be delivered faster and proven out.

Real-time query and analytics are already on the rise, and there are some very exciting developments in that area around in-memory database solutions. At the other end of the spectrum, organisations will start to be much smarter about "right-tiering".

Not all data is created equal or holds the same value. Storing current data in the most expensive systems is a very different approach from storing multiple years of corporate memory, running to petabytes, in lower-cost storage or cloud environments. Organisations will be much smarter in how they approach that and solve those challenges.

Desire Athow
Managing Editor, TechRadar Pro
