What is Amazon Redshift?

When it comes to cloud computing services, Amazon Redshift is a powerhouse of data warehousing. Some of the largest companies in the world, including Ford Motor Company, Lyft, Intuit, and Pfizer, use the data warehouse to store cloud databases and the related production data. The concept of Big Data hinges on the ability to store, process, and analyze vast troves of data, and that is essentially what Amazon Redshift provides.

More than almost any other product, Amazon Redshift has powered the advent of Big Data and data warehousing, allowing companies to build powerful applications and generate reports that provide all of the data they need to run a business.

The best way to understand Amazon Redshift is to start with some basic terms and what they mean, and how this adds up to a powerful, fast, and extensive data warehouse product. Once you understand the terms you can start seeing the benefits of the product.

Basics of data warehousing

One of the first things to understand about Amazon Redshift is that you can start small. Any company can sign up for a single node that stores a database and the related data, and then start running queries and reports on that data (and running its own custom applications). In a single-node cluster, that one node handles everything. Once you add more nodes, a leader node coordinates the work and the nodes that store and process the data are called compute nodes. You could define Amazon Redshift as a cluster of nodes.
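To make the idea of a cluster of nodes concrete, here is a minimal Python sketch of a leader coordinating compute nodes. It is a toy model written for this article, not Redshift's API or its internals: the class names `ComputeNode` and `Cluster`, the round-robin row placement, and the `add_node` rebalancing are all assumptions chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    """Hypothetical compute node: holds a share of the rows."""
    rows: list[int] = field(default_factory=list)

    def partial_sum(self) -> int:
        # Each node works only on the rows it stores.
        return sum(self.rows)


@dataclass
class Cluster:
    """Toy cluster: this object plays the coordinating (leader) role."""
    compute_nodes: list[ComputeNode]

    def load(self, values: list[int]) -> None:
        # Spread rows round-robin across the compute nodes (an assumption;
        # a real warehouse offers several distribution strategies).
        for i, value in enumerate(values):
            self.compute_nodes[i % len(self.compute_nodes)].rows.append(value)

    def total(self) -> int:
        # Leader step: gather each node's partial result and combine them.
        return sum(node.partial_sum() for node in self.compute_nodes)

    def add_node(self) -> None:
        # Growing the cluster: collect the rows, add a node, redistribute.
        values = [v for node in self.compute_nodes for v in node.rows]
        self.compute_nodes = [ComputeNode() for _ in range(len(self.compute_nodes) + 1)]
        self.load(values)


cluster = Cluster([ComputeNode()])   # start small with one node
cluster.load(list(range(1, 11)))
before = cluster.total()
cluster.add_node()                   # scale out to two nodes
after = cluster.total()
```

The answer to the query is the same before and after the second node is added. What changes is how much data each node has to scan, which is the basic reason adding nodes can help as data grows.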

Of course, it is far more complex than that. Redshift is one of a collection of cloud computing products that make up Amazon Web Services. For the cloud storage component, there is Amazon S3 (Amazon Simple Storage Service), which provides the object storage itself.

That said, many companies begin a data warehousing initiative with one node. As your data warehousing needs expand and change, you can add more nodes to the cluster. This helps you build more applications, run more queries, and perform more analytics. Pricing can depend on how long you want to keep those nodes active. The prices go down when you reserve nodes for longer periods of time, such as one year or three years.
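The effect of reserving nodes can be sketched with some simple arithmetic. The hourly rate and the discount percentages below are invented placeholders, not AWS prices, and `yearly_cost` is a hypothetical helper; only the general shape (longer commitments lower the rate) comes from the description above.

```python
HOURS_PER_YEAR = 24 * 365

# Placeholder figures for illustration only; these are NOT real AWS prices.
ON_DEMAND_HOURLY = 1.00                       # assumed cost per node-hour
RESERVED_DISCOUNT = {0: 0.0, 1: 0.30, 3: 0.50}  # assumed discount by term (years)


def yearly_cost(nodes: int, term_years: int = 0) -> float:
    """Estimate one year of cost for `nodes` nodes kept running all year.

    term_years is 0 for no reservation, or 1 or 3 for a reserved term.
    """
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    if term_years not in RESERVED_DISCOUNT:
        raise ValueError("term_years must be 0, 1, or 3")
    hourly = ON_DEMAND_HOURLY * (1 - RESERVED_DISCOUNT[term_years])
    return nodes * hourly * HOURS_PER_YEAR


for term in (0, 1, 3):
    print(f"4 nodes, {term}-year term: {yearly_cost(4, term):,.2f} per year")
```

Swapping in the current published rates for your node type and region would turn this into a rough budgeting aid. The trade-off to weigh is the lower rate against the commitment to keep paying for those nodes for the full term.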

Beyond that, the other important thing to know about Amazon Redshift is that most of the complexity occurs behind the scenes. This includes the endpoint security, management, scaling, provisioning, and anything else related to data warehousing. There’s a web console your IT service management team uses to deal with instances and to create new nodes, but you don’t have to plan or manage the performance characteristics, the backups or archives, or any of the infrastructure behind the database or the data, including the servers and networks.

Amazon recently announced some improvements to Redshift. One of the key changes is that the nodes you use can be optimized separately for performance or storage. Previously, a cluster was maintained for both performance and storage allocation. Amazon also improved networking speed, especially the connections between Redshift and Amazon S3. Amazon claims that Redshift now delivers 3x the performance of competing data warehouse products.

Benefits of Redshift

As with any cloud computing initiative, the reason to use Amazon Redshift has to do with flexibility. As mentioned previously, companies can choose to create a single node as a starting point, but from there they can create massive clusters containing many nodes for every reporting need they have and for any web application. To say the possibilities are endless in terms of database control is not quite true, but with cloud computing, it will seem like it.

Beyond flexibility in what you can do and the applications you run, there is also an advantage in how it is all managed. Your Information Technology staff do not need to manage the cloud computing infrastructure at all, nor do they need to manage the servers, networks, or storage required. Since it is all in the cloud, and it is all part of Amazon Web Services (or AWS), it is all managed remotely and updated automatically.

One last benefit to consider is that Amazon Redshift provides the framework for a company to go beyond its current limitations. This might be a new application that uses a database in the cloud (and data stored in the cloud), or it might be a new way to analyze company data. Some firms even create brand new divisions and departments based on their newfound capability to understand and process data. An example of this might be an automotive manufacturer that can analyze data in real time and use it to develop autonomous driving features.

In the end, the power of Amazon Redshift is only limited by the imagination of the company starting a new initiative, developing a new product, or forming a new division.