Beating bottlenecks amid the big data deluge

Storage challenges for the modern enterprise

What is a storage bottleneck? And how can you avoid it?

Thomas Pavel, EMEA Storage Sales Director at Avago Technologies, told us about the strains caused by the data deluge and how your organisation can avoid them.

TechRadar Pro: What are the biggest challenges of the data deluge?

Thomas Pavel: The volumes of published information and data continue to grow unabated, fuelled by demanding applications like business analytics, social media, video streaming and grid computing. Many organisations, regardless of their area of business, want insight from new and unstructured sources such as news reporting, web usage trends and social media chatter.

The ability to access and retrieve data quickly is also a major factor in business success and customer satisfaction. But there's a lot of data to handle. Just keeping up with this relentless growth in stored data is challenge enough, but how can such vast volumes be handled cost-effectively? And perhaps most importantly, how can storage performance be maintained or even improved?

TRP: What is a storage bottleneck? Where and when do bottlenecks tend to occur?

TP: As the volume of data increases, so too can the time it takes to access it. The point in the system that limits that access is known as a 'bottleneck'. There are many potential locations for 'pain points' or bottlenecks in an enterprise system, so locating the bottleneck is not always simple. Addressing the bottleneck and maintaining performance is the rationale behind continual advances in storage technologies today.

When designing storage systems for performance, it is essential to understand where the bottlenecks can occur. This is especially true given that the bottlenecks change with each new generation of technology along the data storage path.

The three most critical elements that affect storage performance are the server's Peripheral Component Interconnect Express (PCIe®) bus, the SAS solution as implemented in host bus adapters (HBAs) and expanders, and the disk drives themselves, which can have either a SAS or a SATA interface.

Storage bottlenecks migrate among the successive generations of the various technologies involved end-to-end. With the advent of third generation PCIe, for example, second generation SAS became the new storage bottleneck. Third generation SAS is now able to take full advantage of third generation PCIe's performance, making PCIe the new bottleneck in systems using 12Gb/s SAS.
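To make this migration concrete, here is a rough sketch in Python that compares the two links generation by generation. The per-lane throughput figures and the choice of eight PCIe lanes and eight SAS lanes are illustrative assumptions, not numbers from the interview, and the function name is hypothetical.

```python
# Approximate usable throughput per lane in MB/s.
# These are assumed round figures, not values quoted in the interview.
PCIE_LANE_MB_S = {2: 500, 3: 985}
SAS_LANE_MB_S = {2: 600, 3: 1200}


def narrower_link(pcie_gen, sas_gen, pcie_lanes=8, sas_lanes=8):
    """Return which link limits throughput, plus both totals in MB/s."""
    pcie_total = PCIE_LANE_MB_S[pcie_gen] * pcie_lanes
    sas_total = SAS_LANE_MB_S[sas_gen] * sas_lanes
    limiter = "PCIe" if pcie_total < sas_total else "SAS"
    return limiter, pcie_total, sas_total


# Walk through successive generation pairings (assumed x8 PCIe, 8 SAS lanes).
for pcie_gen, sas_gen in [(2, 2), (3, 2), (3, 3)]:
    limiter, pcie_total, sas_total = narrower_link(pcie_gen, sas_gen)
    print(
        f"PCIe gen {pcie_gen} ({pcie_total} MB/s) with "
        f"SAS gen {sas_gen} ({sas_total} MB/s): {limiter} is the limit"
    )
```

Under these assumptions the limit moves from PCIe to SAS when PCIe steps up to its third generation, and back to PCIe once SAS reaches 12Gb/s, which is the pattern described above. Different lane counts would shift the result.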

TRP: What guidelines can we use to maximise storage system performance?

TP: When designing a storage system for high performance, it is necessary to understand the throughput limitations of each element. Critical applications must also scale easily over time while remaining both highly protected and easily manageable.

SAS is now in its third generation, and performance has doubled with each new generation, from the original 3Gb/s to 6Gb/s and now 12Gb/s. SAS, like PCIe, uses lanes, and high-performance storage systems normally aggregate multiple SAS lanes to support high data rates.
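The effect of doubling the line rate and aggregating lanes can be worked through with a small calculation. This sketch assumes 8b/10b encoding, so ten line bits carry one byte of data, and a four-lane port; both are assumptions added for illustration, and the function names are hypothetical.

```python
def sas_lane_mb_s(line_rate_gbps):
    """Usable MB/s for one SAS lane, assuming 8b/10b encoding
    (10 bits on the wire for every data byte)."""
    return line_rate_gbps * 1000 / 10


def sas_port_mb_s(line_rate_gbps, lanes):
    """Aggregate usable MB/s for a port built from several lanes."""
    return sas_lane_mb_s(line_rate_gbps) * lanes


# The three SAS generations named above, on an assumed 4-lane port.
for rate in (3, 6, 12):
    print(
        f"{rate}Gb/s SAS: {sas_lane_mb_s(rate):.0f} MB/s per lane, "
        f"{sas_port_mb_s(rate, 4):.0f} MB/s over 4 lanes"
    )
```

Real-world throughput will be somewhat lower once protocol overhead is taken into account, so these figures are best read as upper bounds.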

TRP: Does the storage bottleneck change with different system configurations?

TP: This table provides a summary of some sample configurations, showing where the bottleneck exists when each is configured with a "full complement" of disks (the disks being the slowest element in the system). As shown, the need to support more disks (for capacity) requires the use of later generations of SAS and/or PCIe, and/or more SAS lanes.

Looking at it another way, in systems with a small number of disks, their relatively low aggregate throughput becomes the bottleneck, so there is no need to "over-design" the configuration with later generation technologies and/or more SAS lanes. The disks referenced in the table example all have a 6Gb/s interface with a throughput of 230MB/s and 550MB/s for the 15K RPM HDDs and SSDs, respectively.
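The point about small disk counts can be checked with simple arithmetic. The sketch below uses the 230MB/s and 550MB/s disk figures quoted above; the link capacity of 2400 MB/s (roughly four 6Gb/s SAS lanes at about 600 MB/s each) is an assumption chosen for illustration, and the function names are hypothetical.

```python
import math

HDD_MB_S = 230  # 15K RPM HDD throughput, as quoted in the interview
SSD_MB_S = 550  # SSD throughput, as quoted in the interview
LINK_MB_S = 2400  # assumed: four 6Gb/s SAS lanes at ~600 MB/s each


def disks_to_saturate(link_mb_s, disk_mb_s):
    """Smallest number of disks whose combined throughput fills the link."""
    return math.ceil(link_mb_s / disk_mb_s)


def bottleneck(disk_count, disk_mb_s, link_mb_s):
    """Return the limiting element and the resulting throughput in MB/s."""
    aggregate = disk_count * disk_mb_s
    if aggregate < link_mb_s:
        return "disks", aggregate
    return "link", link_mb_s


print("HDDs needed to fill the link:", disks_to_saturate(LINK_MB_S, HDD_MB_S))
print("SSDs needed to fill the link:", disks_to_saturate(LINK_MB_S, SSD_MB_S))
print("8 HDDs:", bottleneck(8, HDD_MB_S, LINK_MB_S))
print("24 HDDs:", bottleneck(24, HDD_MB_S, LINK_MB_S))
```

With this assumed link, eight HDDs leave the disks as the limit, so a faster link would add nothing, while 24 HDDs push the limit onto the link itself. SSDs reach that point with far fewer drives.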