Software-defined storage evolved: from scale-out architecture to SSDs

SDS looks to help crunch petabytes of data

Software-defined storage (SDS) is still a relatively new concept. While its definition varies among vendors, it is centred around decoupling storage intelligence from the hardware that data resides on, allowing data to be pooled and assigned to applications through automated policy-based management.

One benefit of this is that commodity storage from a wide range of hardware vendors can be added at any time without adding to management complexity, which is a key tenet of SDS according to US-based SDS vendor Nexenta.

Nexenta counts more than 5,000 businesses and cloud service providers among its customer base and is now advancing its SDS solutions with a focus on scale-out architectures and all-flash SSD arrays.

We spoke to the company's chief product officer Thomas Cornely to find out more.

Shaking up storage

TechRadar Pro: How is software-defined storage shaking up the industry?

Thomas Cornely: Everybody is now calling themselves software-defined storage, which is funny. The way we look at it, there is a big difference between SDS and software-based solutions, which pretty much everybody out there does these days. Most vendors, which we call software-based storage vendors, are running on the same components and using software to build the functionality.

That's good for them because in the end they are still selling the same systems they always have, and are charging the same margins that they've always charged. More than servers or networking, there's massive room for disruption in storage, which you can see by looking at the high margins EMC takes on its hardware products. This is what software-defined storage players like us are able to change; true SDS is good for customers.

TRP: How is SDS good for customers?

TC: It's about breaking the storage model; it's an economic argument and not about technology, per se. Technology is an enabler, but it's about how you deliver storage to the customer, and if you look at the software-defined space, there are only a few vendors.

There's Nexenta, and VMware, which is arguably now doing SDS solutions with vSAN, but that is only for VMware. Microsoft has Storage Spaces and Windows Storage Server, but they are only for Microsoft environments. It's the same sort of thing with Red Hat.

Nexenta right now is the only player that can run on a wide variety of hardware partners that cater to all workloads. We work underneath VMware, Hyper-V, and Microsoft native environments, Linux Environments, CloudStack and OpenStack, and so on. It boils down to economics and bringing the costs down so that the customer can spend less on storage. From there, it's about flexibility and choice.

A customer may like to buy from Dell, so it can now get Dell end-to-end on the hardware side. Perhaps they want to use Dell for compute and storage, and then they may want to do things with HP, which they can because they get that choice without having to compromise on features and functionality.

TRP: How can software-defined storage help cloud service providers get up and running faster and for less cost?

TC: For us, historically, our customers would choose NexentaStor, which allows cloud service providers to build cost-efficient cloud backends for their CloudStack and VMware environments using NFS as a backend service.

We're now seeing more software move toward OpenStack and its solutions, where customers are looking to scale deployments to not just a few petabytes, but tens of petabytes. I think that calls for new technologies, which is why we are soon launching NexentaEdge.

Capacity crunch

TRP: What is NexentaEdge and what are its main benefits?

TC: It's a new offering for us that we announced at VMworld in San Francisco that's tailored toward scale-out architectures deployed on top of Linux. We've managed to run and support OpenStack environments by delivering both block and object services to them.

The key for NexentaEdge is global inline deduplication, which allows data that gets stored in the cluster to get deduplicated as it goes in, meaning you only have to store those chunks once. Now, think about that. Why do people do object storage? It's because they are looking for the most cost-effective solution for large capacity configurations.

You can do that by running on commodity hardware using object to scale out and keep things simple. On top of this, you are able to run cost-effective hardware while adding compression and deduplication to be more efficient in terms of how you pay for capacity.

The other benefit is that you can do it as a backend for OpenStack, where you would usually be running tens of thousands of virtual machines. But you don't need to store each of them in full, as the OS only needs storing once, which means 10,000 copies of the same OS are stored once in the cluster.

This allows us to do scale-out functionality on a petabyte scale. Those that do dedupe today are typically all-flash vendors, and that's it.
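To make the idea of inline deduplication concrete, here is a minimal sketch of a content-addressed chunk store. It is not Nexenta's implementation: the `DedupStore` class, the fixed `CHUNK_SIZE`, and the use of SHA-256 are all assumptions chosen for illustration. Each chunk is hashed as it is written, and a chunk whose hash is already known is not stored a second time.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen for illustration


class DedupStore:
    """Toy in-memory store that deduplicates chunks as they are written."""

    def __init__(self):
        self.chunks = {}   # chunk digest -> chunk bytes (each unique chunk kept once)
        self.objects = {}  # object name -> ordered list of chunk digests

    def put(self, name, data):
        digests = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # Inline dedup: only keep the chunk if this digest is new.
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.objects[name] = digests

    def get(self, name):
        # Rebuild the object from its chunk references.
        return b"".join(self.chunks[d] for d in self.objects[name])

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


# 100 virtual machines that all share the same 4-chunk "OS image".
store = DedupStore()
os_image = b"".join(bytes([i]) * CHUNK_SIZE for i in range(4))
for vm in range(100):
    store.put(f"vm-{vm}", os_image)

logical_bytes = 100 * len(os_image)
print(logical_bytes, store.stored_bytes())  # prints 1638400 16384
```

In this toy case the cluster holds 100 logical copies of the image but only one physical copy of each chunk, which is the effect described above for thousands of virtual machines sharing one OS.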

TRP: What is the underlying tech that allows this?

TC: We put in a lot of IP. The other part of Nexenta is that even though we're an open source company we have a lot of core IP that complements what we do there. In this particular case there's something called flex hashing that allows us to do deduplication. That's where we place data in the cluster to give us dedupe almost by default, so the design is very important in how we approach NexentaEdge and object storage.

TRP: Is that based around an algorithm?

TC: There's the algorithm, but it's also about how you physically place it in the cluster, as well as how you hash it, and how you decide where to put it in the cluster based on the hash.
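The interview does not describe how flex hashing works, so the sketch below does not attempt to reproduce it. Instead it uses rendezvous (highest-random-weight) hashing, a well-known public technique, to illustrate the general principle of deciding where a chunk lives from the hash of its content. The node names and the `place_chunk` function are hypothetical.

```python
import hashlib

# Hypothetical storage nodes in a small cluster.
NODES = ["node-a", "node-b", "node-c", "node-d"]


def place_chunk(chunk, nodes, replicas=1):
    """Pick target nodes for a chunk from the hash of its content.

    Rendezvous hashing: score every node against the chunk digest and
    keep the highest-scoring ones. This is an illustrative stand-in,
    not the flex hashing mentioned in the interview.
    """
    digest = hashlib.sha256(chunk).digest()

    def score(node):
        return hashlib.sha256(digest + node.encode("utf-8")).digest()

    return sorted(nodes, key=score, reverse=True)[:replicas]


# Identical content always maps to the same nodes, so a duplicate chunk
# arrives where the first copy already lives and can be detected there.
first = place_chunk(b"shared OS block", NODES, replicas=2)
second = place_chunk(b"shared OS block", NODES, replicas=2)
print(first == second)  # prints True
```

Because placement depends only on content, two writers storing the same data are routed to the same nodes without any central lookup, which is one way a design can get deduplication "almost by default".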

Fast flash

TRP: What is Nexenta doing with all-flash arrays and how can it help your customers?

TC: If you look at NexentaStor, the core technology we've been using to displace NetApp and so on, it's a great solution to go and do scale-out configurations, but it's not so much about the technology as it is the comfort level of the customer. In the past few years we've seen some of our cloud customers use NexentaStor for all-SSD configurations.

We have a customer in California that's deploying purely all-flash configurations, which is NexentaStor with all SSD. They're doing it today and it's working great for them, but we know that the software can do more and be more optimised for flash.

We've now announced an SSD mode that will be out before the end of the year which allows us to turn on a code path in the software that tunes it to be optimised with all-flash configurations. This gives you many ways to bring the economic benefits of SDS to all-flash arrays.

Today, all-flash arrays come in the form of software systems or appliances, and they tend to be very expensive and come with all kinds of limitations. You have to buy them from the same vendor and in most cases they are only block systems. Here, you get block and file systems, and software that can run on all-flash configurations through reference architectures. This means that you're not locked in or tied to a particular vendor, and you get to physically pick and choose the best SSDs that fit your needs.

You can then run Nexenta functionality on top of your storage, which I think will be a key disruptor not only to the core storage market, but also the emerging all-flash storage market.

TRP: What are some of the use cases for SSDs?

TC: I think they can be used for everything, but right now because of the cost argument, they tend to get used for high-value workloads. We want to bring the cost down and make SSDs relevant for an even wider range of workloads.

I think that, where there is a lot of random IO, SSDs make sense for the backends of virtual environments as it allows you to physically benefit from increased performance for a wider range of virtual machines. I think that's a key aspect.

Kane has been fascinated by the endless possibilities of computers since first getting his hands on an Amiga 500+ back in 1991. These days he mostly lives in the realm of VR, where he's working his way into the world Paddleball rankings in Rec Room.