When Big Data companies require the highest performance available and the most reliable storage, they tend to want fixed, local data storage. They often require very low latency so that applications run without delays or disruptions. They need massive amounts of storage, often at petabyte scale. They need the highest throughput because the research they are conducting is part of a critical project, such as cancer research or developing new safety materials or pharmaceuticals. In short, they have no latitude for disruptions, slowdowns, faults, or throughput issues and need to replicate exactly what they would experience if they built their own data center. Yet these same Big Data companies want all of the flexibility and scale of cloud storage and don’t want to purchase expensive servers and related infrastructure.
Amazon EBS (Elastic Block Store) is a product designed for this exact scenario. It works with Amazon Elastic Compute Cloud (EC2), so all of the storage is available in the cloud, yet it uses block storage in the same way you would expect if you had local drives and servers in a data center. That means your staff can create storage volumes as though they are in a data center, as opposed to using the more typical object storage method. (Object storage is more “elastic” in that each object carries its own metadata and a unique identifier, so it can reside anywhere in the storage pool and apps retrieve it by that identifier.)
With block storage, each volume acts as a raw storage device that can be attached, mounted, and used as though it is a local drive. Each of these volumes can be up to 16 TB in size, and you can create and manage your own file system on the drives.
EBS provides two distinct options for Big Data companies, both of which essentially replicate what they would experience with an on-site data center or a large local storage array in the same room. These two options will be very familiar to anyone who has built an on-premises data center.
With EBS, you can choose to attach a volume to your EC2 instance that is backed by SSDs (solid-state drives), which helps when you need very low latency. This might include an enterprise app that requires extremely fast operations for transactional data or a relational database that is used with e-commerce software. It might also involve a NoSQL database (NoSQL stands for Not Only SQL) behind a high-performance app.
The second option is to use HDDs (hard disk drives), which are geared toward large, sequential workloads where sustained throughput at a lower cost matters more than latency. This might involve a media streaming service where high input/output is required, a Big Data research project, data warehousing and log processing, or staging backups. It’s meant for any application that needs to move large amounts of data as quickly as possible.
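As a rough illustration of the SSD-versus-HDD choice described above, the decision can be sketched as a small lookup. The workload labels and the function name are hypothetical; `gp3` and `st1` appear only as examples of EBS's SSD-backed and HDD-backed volume type identifiers, and a real decision would also weigh cost and measured I/O patterns.

```python
# Hypothetical workload labels mapped to an example EBS volume type.
# "gp3" is a general-purpose SSD type and "st1" is a throughput-optimized
# HDD type; the mapping itself is an assumption made for illustration.
VOLUME_TYPE_BY_WORKLOAD = {
    "transactional_db": "gp3",  # small, random, latency-sensitive I/O
    "nosql_db": "gp3",
    "media_streaming": "st1",   # large, sequential, throughput-heavy I/O
    "data_warehouse": "st1",
    "log_processing": "st1",
    "backup_staging": "st1",
}


def pick_volume_type(workload: str) -> str:
    """Return an example EBS volume type for a known workload label."""
    try:
        return VOLUME_TYPE_BY_WORKLOAD[workload]
    except KeyError:
        raise ValueError(f"unknown workload label: {workload!r}") from None


if __name__ == "__main__":
    print(pick_volume_type("transactional_db"))
    print(pick_volume_type("log_processing"))
```

Keeping the mapping in one table makes it easy to revisit as workloads change, instead of scattering the choice across provisioning scripts.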
EBS also provides encryption for data, both when it is in transit for use in your applications and when it is at rest. This helps reduce the likelihood of a data breach not only when the data is sitting in the cloud and not used by your apps, but also when it is “in-flight” as part of a research project, used in a transactional database, or processed by your apps.
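To make the pieces above concrete, here is a minimal sketch of the parameters for a request to create an encrypted volume. The helper name, the Availability Zone, and the default values are assumptions for illustration. The keys follow the keyword arguments of boto3's EC2 `create_volume` call, and the size check uses the 16 TB per-volume figure quoted earlier (actual minimums and maximums vary by volume type).

```python
# The 16 TB per-volume ceiling quoted in the article, expressed in GiB.
MAX_SIZE_GIB = 16 * 1024


def build_create_volume_request(availability_zone: str, size_gib: int,
                                volume_type: str = "gp3",
                                encrypted: bool = True) -> dict:
    """Build keyword arguments for an EC2 create_volume request.

    Hypothetical helper: it only assembles and sanity-checks parameters
    and does not contact AWS.
    """
    if not 1 <= size_gib <= MAX_SIZE_GIB:
        raise ValueError(f"size_gib must be between 1 and {MAX_SIZE_GIB}")
    return {
        "AvailabilityZone": availability_zone,
        "Size": size_gib,           # size in GiB
        "VolumeType": volume_type,  # e.g. an SSD- or HDD-backed type
        "Encrypted": encrypted,     # request encryption at rest
    }


if __name__ == "__main__":
    request = build_create_volume_request("us-east-1a", 500)
    print(request)
    # With boto3 installed and AWS credentials configured, the request
    # could be sent like this (not executed here):
    #   import boto3
    #   boto3.client("ec2").create_volume(**request)
```

Separating parameter construction from the API call keeps the validation testable without an AWS account.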
Benefits of AWS EBS
Perhaps one of the most important advantages of using EBS is that it is easy to use and understand. It is essentially the same as a local storage drive that you can configure, attach to an EC2 instance, and mount as though the drives are in the same room. With EBS, you can tune the drives, change settings, adjust performance variables, and set up your backups in the same way as you might do for a local storage array in your own data center or on a server.
In addition, EBS has several advantages related to reliability, availability, and durability. As mentioned earlier, you can attach volumes to an EC2 instance that are powered by either SSD or HDD storage, depending on your needs. Both options are automatically replicated within their Availability Zone, much as you would expect from a managed AWS service such as Amazon S3 (Simple Storage Service). Amazon designs EBS for 99.999% availability.
There is also an important benefit related to which of the two storage options you choose. With SSD, there is a low-latency advantage and also a price-performance benefit. SSD is well-suited for applications that require raw speed and low latency, such as an enterprise app that is used at a large company to process transactional data. HDD is aimed at companies doing major research projects, such as drug discovery or studying the materials used for a new bridge design, that need to stream massive amounts of research data at sustained throughput and a lower cost per gigabyte.
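To see why sustained throughput matters for large research datasets, a back-of-the-envelope calculation helps. The sketch below estimates how long one sequential pass over a dataset takes at a given throughput. The dataset size and throughput figures are hypothetical inputs, not EBS specifications.

```python
def transfer_hours(dataset_tib: float, throughput_mib_per_s: float) -> float:
    """Hours needed to read a dataset once at a sustained throughput."""
    if throughput_mib_per_s <= 0:
        raise ValueError("throughput must be positive")
    dataset_mib = dataset_tib * 1024 * 1024  # TiB -> MiB
    return dataset_mib / throughput_mib_per_s / 3600  # seconds -> hours


if __name__ == "__main__":
    # Hypothetical 10 TiB dataset at two illustrative throughput levels.
    for mib_per_s in (250, 500):
        hours = transfer_hours(10, mib_per_s)
        print(f"{mib_per_s} MiB/s: {hours:.1f} hours")
```

Doubling the sustained throughput halves the time for a full pass, which is why throughput rather than latency tends to dominate for scan-heavy workloads.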
EBS offers many of the same advantages as local infrastructure and local storage without the need to build that infrastructure or data center yourself.