How to get the most out of your NVMe SSDs


The market for enterprise storage has largely been driven by the need for high sustained performance, density, reliability and efficiency at a reasonable price – which is why hard disk drives (HDDs) were the top storage medium for enterprises for a number of years. These enterprises were able to overcome some of the performance and reliability challenges inherent in HDDs by using RAID technology, which was designed with HDDs specifically in mind.

More recently, with the introduction of Gen 4 NVMe SSDs, the performance gap between SSDs and HDDs has become too large to ignore. As enterprise applications started using more real-time data, HDDs could no longer keep up. Using NVMe SSDs solved the performance challenge, but endurance and reliability remained problematic.

In a nutshell: about SSDs

First, a brief primer on SSDs themselves. SSDs (unlike HDDs) store data in cells. Cells can be single-level (SLC), multi-level (MLC), triple-level (TLC) or even quad-level (QLC), storing one, two, three or four bits per cell respectively. As the number of bits per cell rises, storage density increases, but endurance drops significantly because of the write mechanism explained below.
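
As a rough illustration of that density-versus-endurance trade-off, the short Python sketch below pairs each cell type with its bits per cell and a ballpark program/erase (P/E) cycle rating. The P/E figures are assumptions for illustration only, not specifications for any particular drive.

    # Density vs. endurance trade-off across NAND cell types.
    # P/E cycle figures are ballpark assumptions, not vendor specifications.
    CELL_TYPES = {
        #       (bits per cell, assumed typical P/E cycles)
        "SLC": (1, 100_000),
        "MLC": (2, 10_000),
        "TLC": (3, 3_000),
        "QLC": (4, 1_000),
    }

    for name, (bits, pe_cycles) in CELL_TYPES.items():
        # Capacity scales with the number of bits stored per cell.
        print(f"{name}: {bits} bit(s)/cell -> {bits}x relative capacity, "
              f"~{pe_cycles} P/E cycles (assumed)")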

SSDs have a limited number of write cycles, after which the cells can no longer hold the charge needed to store data. In addition, SSDs cannot overwrite existing data in place; it must first be erased, and erasure happens at the block level. This means that if a few pages in a block need to be kept while the rest are deleted, the surviving pages must be copied to a new block before the old block is erased. This causes write amplification. In applications like databases, which perform many overwrites, write amplification on SSDs can be as high as 30x, and it gets worse as the storage density of the SSD increases. The endurance of a QLC drive is therefore much lower than that of a TLC drive.
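
The minimal Python model below shows why small overwrites amplify so badly: to update even one page, the drive must also relocate every still-valid page in the block before erasing it. The page counts and the helper function rewrite_cost are illustrative assumptions, not the behavior of any specific drive's flash translation layer.

    # Minimal model of write amplification caused by block-level erasure.
    # Page counts are illustrative assumptions, not from a specific drive.

    def rewrite_cost(valid_pages: int, pages_updated: int) -> float:
        """Write amplification for updating `pages_updated` pages in a block
        that still holds `valid_pages` pages of live data."""
        # The host writes `pages_updated` pages, but the drive must also copy
        # the remaining valid pages to a fresh block before erasing the old one.
        pages_relocated = valid_pages - pages_updated
        nand_writes = pages_updated + pages_relocated
        return nand_writes / pages_updated

    # Updating 1 page in a block that holds 60 valid pages:
    print(rewrite_cost(valid_pages=60, pages_updated=1))  # 60.0x amplification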

NVMe SSDs also deliver significantly better sequential write performance than random write performance, and the gap widens as the drives fill up. To maintain consistent performance, drive vendors typically recommend staying below a specific fill rate. Beyond that point, garbage collection and other internal processes take far too long, and drive performance drops significantly.
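
As a simple sketch of how a deployment might enforce such a ceiling, the snippet below refuses new writes once a drive passes an assumed fill ratio. The 80% threshold and the can_accept_write helper are illustrative assumptions; the right figure comes from your drive vendor's guidance.

    # Capacity guard: refuse writes past an assumed fill ceiling so garbage
    # collection keeps enough free space to run efficiently.
    # The 0.80 threshold is an illustrative assumption, not a vendor figure.
    RECOMMENDED_FILL_RATIO = 0.80

    def can_accept_write(used_bytes: int, capacity_bytes: int, write_bytes: int) -> bool:
        return (used_bytes + write_bytes) / capacity_bytes <= RECOMMENDED_FILL_RATIO

    # Example: 3.2 TB already used on a 4 TB drive, incoming 100 GB write
    print(can_accept_write(3_200 * 10**9, 4_000 * 10**9, 100 * 10**9))  # False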

The trouble with RAID 

RAID cards tend to exacerbate write amplification on top of the application's own write amplification, lowering the endurance of NVMe drives. During disk rebuilds in particular, RAID cards rebuild the entire disk, which generates a great deal of write amplification. It would make far more sense to rebuild only the data actually stored on the drive: write amplification would be lower and rebuilds would be much faster, especially when the drive was not full.
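
A back-of-the-envelope comparison makes the point. The capacity and fill figures below are assumptions chosen purely for illustration.

    # Rebuilding an entire drive vs. rebuilding only the live data on it.
    # Capacity and fill level are illustrative assumptions.
    drive_capacity_tb = 8.0
    live_data_tb = 3.0                            # assume the failed drive was ~38% full

    full_rebuild_writes = drive_capacity_tb       # traditional RAID rewrites every block
    data_only_rebuild_writes = live_data_tb       # rebuild just what was actually stored

    print(f"Full-drive rebuild writes: {full_rebuild_writes} TB")
    print(f"Data-only rebuild writes : {data_only_rebuild_writes} TB")
    print(f"Extra wear avoided       : {full_rebuild_writes - data_only_rebuild_writes} TB")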

Traditional RAID also requires a great deal of overprovisioning because it was designed with protection, not performance, in mind. To improve performance, nested RAID levels (such as RAID 10, 50 or 60) were introduced, which require even more overprovisioning. A smarter solution is to compress the data so that less of it needs to be stored, and then use the capacity savings to offset the overprovisioning.
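
A quick worked example shows how compression can offset the mirroring overhead. The 2:1 compression ratio is an assumption; real ratios depend entirely on the data.

    # Capacity lost to mirroring vs. capacity recovered by compression.
    # The 2:1 compression ratio is an assumption for illustration only.
    raw_capacity_tb = 100.0

    raid10_usable_tb = raw_capacity_tb * 0.5           # mirroring halves usable capacity
    compressed_effective_tb = raid10_usable_tb * 2.0   # assumed 2:1 compression

    print(f"RAID 10 usable capacity        : {raid10_usable_tb} TB")
    print(f"Effective capacity at 2:1 ratio: {compressed_effective_tb} TB")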

A different approach

NVMe SSDs perform worse at smaller block sizes, especially during random access. Given that, it makes sense to combine multiple write operations before committing them to disk. Processing the data before it is written would greatly benefit performance, endurance and reliability. In essence, the storage stack needs to evolve to take full advantage of NVMe drives. One solution is the use of storage accelerators with onboard nonvolatile memory. A hardware storage accelerator offers several benefits:

  • All IO is written to the accelerator first, which can improve latency significantly. 
  • Priming the data before it is written to the NVMe SSD means that it can be compressed, sorted, indexed, packed and encrypted before being written to disk. This way, all access to the data (reads and writes) can be sequentialized into large blocks to maximize performance, even as the drives fill up (a simplified sketch of this batching appears after this list).
  • Compressing the data minimizes the IO written to and read from the disk, providing a performance boost, especially for highly compressible data.
  • Parity calculations protect the data. Spreading the parity across multiple disks can eliminate the need for a hot spare while retaining the performance boost.  
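
The sketch below illustrates the batching idea from the bullets above: small writes are staged, compressed as a batch, and flushed to the SSD as one large sequential block. It is a conceptual illustration only, not any vendor's implementation; the WriteCoalescer class, the 1 MiB threshold and the device.append method are all hypothetical names introduced here.

    import zlib

    # Conceptual sketch of write coalescing: stage small writes, compress the
    # batch, and flush one large sequential block to the SSD.
    FLUSH_THRESHOLD = 1 << 20   # assumed 1 MiB staging buffer

    class WriteCoalescer:
        def __init__(self, device):
            self.device = device      # assumed object exposing an append(bytes) method
            self.staged = bytearray()

        def write(self, payload: bytes) -> None:
            self.staged += payload
            if len(self.staged) >= FLUSH_THRESHOLD:
                self.flush()

        def flush(self) -> None:
            if not self.staged:
                return
            # Compress the whole batch, then issue a single large sequential write.
            self.device.append(zlib.compress(bytes(self.staged)))
            self.staged.clear()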

A hardware storage accelerator is far more efficient than a CPU for this work because of the sheer number of I/O operations and calculations involved. For example, an accelerator can deliver storage performance equivalent to hundreds of Xeon Gold cores, all within the power envelope of a PCIe slot (40W maximum). This also frees the CPU for the application's other requirements.

By increasing local data protection and disk endurance, the SLAs of servers and applications improve. Fewer local server failures also mean less network traffic for backup and recovery. This extends the useful life of the server and brings down the overall total cost of ownership (TCO) of the server infrastructure, while also lowering its carbon footprint. 

Looking Ahead

In order to take advantage of modern NVMe storage, traditional data protection technologies such as RAID need to be reevaluated. Storage architectures are evolving to keep up, tapping hardware storage accelerators to get the most out of modern SSDs. With this new approach, accelerated application performance, higher SSD endurance and usable life, and unlocked capacity are all possible.


Balaji Ramanuja, Director of Product Management, Pliops.