The data capacity gap: why the world is running out of data storage

Another forward-looking team at Harvard University's Wyss Institute later demonstrated a way to store 5.5 petabits – about 700 terabytes – of digital data in a single gram of DNA.

TRP: Is Seagate developing any new storage solutions at the moment?

MW: Heat-assisted magnetic recording (HAMR) is one new technology that Seagate is investing in. This method uses a laser to briefly heat the high-stability media before data is magnetically recorded. HAMR is expected to increase the limit of magnetic recording density by more than a factor of 100, which could theoretically result in storage densities as great as 50 terabits per square inch – current hard drives generally offer only a few hundred gigabits per square inch.

To put this in perspective, a digital library of every book written in the world would come to approximately 400 TB – meaning that, conceivably, in the near future all such books could be stored on as few as 20 HAMR drives.
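The arithmetic behind these claims can be checked quickly. In the sketch below, the values are taken from the interview, and the 20 TB per-drive capacity is an assumption implied by "400 TB on 20 drives" rather than a stated Seagate specification:

```python
# Back-of-envelope check of the figures quoted above.
# Assumptions: "a few hundred gigabits per square inch" is taken as
# ~0.5 Tb/in^2; the 20 TB drive capacity is implied, not quoted.
current_density_tbpsi = 0.5    # approximate current areal density, Tb/in^2
hamr_density_tbpsi = 50.0      # theoretical HAMR ceiling from the interview

density_gain = hamr_density_tbpsi / current_density_tbpsi
print(density_gain)            # 100.0 -> "a factor of 100"

library_tb = 400               # digital library of all books, per the interview
drives = 20                    # "as few as 20 HAMR drives"
hamr_drive_tb = library_tb / drives
print(hamr_drive_tb)           # 20.0 TB implied per drive
```

The factor-of-100 claim and the 20-drive claim are consistent with each other under these assumptions.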

While these technologies are still some way from our desks and data centres, these advances and others like them are certainly on their way. Innovation combined with the plunging cost of components is ultimately what's needed if we are to keep up with the world's growing demand for data storage.

TRP: Will CIOs need to supplement existing storage resources?

MW: CIOs certainly need to consider the implications of a data capacity gap for their business, and address it by thinking strategically and longer-term with regard to their storage resources.

Typical big data resides on traditional disk storage, using standard hardware and software components. Since companies began to store information, a large amount of standard infrastructure has built up around the process, and data centres today are full of legacy hardware and software stacks.

This approach is highly inefficient – in a single system there will often be several unsynchronised components caching the same data, each working independently and to little effect. To reach a better cost and efficiency model that matches future requirements, companies need to put a better solution in place.

One of the latest big data storage methods is a tiered model using existing technologies. This model utilises a more efficient capacity tier based on pure object storage at the drive level. Above this sits a combination of high-performance HDDs (hard disk drives), SSHDs (solid state hybrid drives) and SSDs (solid state drives).

SSHD technology such as this has been used successfully in laptops and desktop computers for years, but it is only now beginning to be considered for enterprise-scale data centres. This method allows the most critical data to sit on the more expensive SSDs or hybrids, where it is quick to access and well-placed to be processed by analytics platforms, while the less valuable metadata sits on cheaper HDDs, where it remains available and secure but slower to access.
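The tiering idea described above can be sketched as a simple placement policy. This is an illustrative sketch only – the tier names, the access-frequency thresholds, and the `DataObject` type are all hypothetical, not part of any Seagate product:

```python
# Minimal sketch of a tiered-placement policy: frequently accessed ("hot")
# data lands on the fast, expensive tier; cold data lands on the cheap
# capacity tier. All names and thresholds are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class DataObject:
    key: str
    accesses_per_day: int  # hypothetical access-frequency metric

def choose_tier(obj: DataObject, hot_threshold: int = 100) -> str:
    """Route an object to a storage tier by how often it is read."""
    if obj.accesses_per_day >= hot_threshold:
        return "ssd"   # performance tier: fast, costly, analytics-ready
    elif obj.accesses_per_day >= hot_threshold // 10:
        return "sshd"  # hybrid tier: warm data
    return "hdd"       # capacity tier: cheap object storage, slower access

print(choose_tier(DataObject("analytics-index", 500)))  # ssd
print(choose_tier(DataObject("archive-log", 2)))        # hdd
```

Real tiering systems migrate data between tiers continuously as access patterns change; the static decision above only illustrates the cost/performance trade-off the interview describes.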

This potential part-solution to the data capacity gap is part of a growing trend towards larger, more efficient data centres. That the world has grown from producing just under one zettabyte of data per annum back in 2009 to potentially well over 44 zettabytes in 2020 is truly astounding. Managing this – whether you're a technologist responsible for managing data, a business user who has the task of analysing it, or a consumer trying to manage the flood of your own digital information – will be an interesting challenge for all of us.