What is Amazon Athena?

The answers companies need from their data can sometimes be elusive. We live in an age where data is abundant, especially with the expansion into cloud storage. But the tools to analyze and process that data are not always easy to use, readily accessible, or even that effective. The problem? Data has to reside somewhere, and most companies have to think about how it is stored, who will access it, how to keep it secure, and most importantly how to make data access reliable and fast.

That’s where Amazon Athena can help. It’s a query service that lets companies run SQL queries against their cloud data as though it resided in a local data center. It’s serverless, meaning you don’t have to manage any infrastructure or database software. And it’s fast: your staff can run SQL queries and expect results within seconds, even on large datasets.

To use Amazon Athena, you first store the data in Amazon S3 (Simple Storage Service), an object storage service that runs in the cloud. Amazon S3 keeps the data accessible and safe, while Amazon Athena is the query service that derives the results you need from it. This means you don't need to design and load a separate database before you can query the data.
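To make that flow concrete, here is a minimal Python sketch of running one query against data in S3: start the query, wait for it to finish, then read the results. It assumes the AWS SDK for Python (boto3) and its Athena client operations `start_query_execution`, `get_query_execution`, and `get_query_results`. The helper names `build_query_request` and `run_query` are hypothetical, invented for this illustration, and this is a sketch rather than a production pattern.

```python
import time

# States after which Athena will not change the query status again.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def build_query_request(sql, database, output_location):
    """Build the keyword arguments for Athena's start_query_execution call.

    output_location is the S3 path where Athena writes the result files.
    """
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(client, sql, database, output_location, poll_seconds=1.0):
    """Start a query, poll until it reaches a terminal state, return (state, rows).

    `client` is expected to behave like boto3.client("athena").
    """
    request = build_query_request(sql, database, output_location)
    query_id = client.start_query_execution(**request)["QueryExecutionId"]

    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in TERMINAL_STATES:
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        return state, []

    # Only the first page of results is read here; for a SELECT query the
    # first row holds the column names.
    result = client.get_query_results(QueryExecutionId=query_id)
    rows = [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]
    return state, rows
```

With AWS credentials configured, a call might look like `run_query(boto3.client("athena"), "SELECT * FROM samples LIMIT 10", "research", "s3://example-bucket/athena-results/")`. The table, database, and bucket names in that call are placeholders, not values from any real account.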

Benefits of Amazon Athena

As with most Amazon Web Services offerings, the major benefit of using Amazon Athena is that it provides great flexibility in how you run queries without added complexity. One example is a pharmaceutical company using the cloud for genomic research. Its staff might decide to run multiple queries against the dataset, but normally each one would require setup and configuration to create a cloud database that can accept the queries. With Athena, the staff can run multiple queries concurrently and trust that the results will be clean and accessible within seconds. These actionable results give companies clean, reliable data to make better decisions and continue their research.

A related advantage of Athena is lower cost. Companies don’t have to manage the infrastructure footprint their datasets require, so if they run multiple queries or need to make decisions based on a vast trove of data, they don’t first have to upgrade their IT infrastructure or reconfigure their data storage to handle the higher number of requests. Athena scales its resources up and down automatically to suit the queries at hand.

As mentioned earlier, Athena is flexible enough to handle a variety of tasks related to database queries. It runs standard SQL and supports common data formats such as CSV, JSON, ORC, Avro, and Parquet. Athena uses Presto, an open-source SQL query engine with ANSI SQL support, so it is not a proprietary query language that users have to learn from the ground up. Athena lets you run quick SQL queries but also supports more complex operations, such as joins and queries over array columns.
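As a rough illustration of what that SQL can look like, the Python sketch below builds a table definition over Parquet files in S3 and a query that combines a join with an array column. Everything specific in it is an assumption made for the example: the `samples` and `patients` tables, their columns, the bucket path, and the helper `create_table_ddl` are all hypothetical. The sketch covers only Parquet and ORC because those can be declared with a plain `STORED AS` clause; CSV and JSON tables need extra row-format clauses that are left out here.

```python
# Formats this sketch can declare with a plain STORED AS clause.
# CSV and JSON need additional ROW FORMAT / SerDe clauses, omitted here.
SIMPLE_FORMATS = {"PARQUET", "ORC"}


def create_table_ddl(table, columns, location, file_format="PARQUET"):
    """Return a CREATE EXTERNAL TABLE statement for files already in S3.

    columns is a list of (name, type) pairs, e.g. ("variants", "array<string>").
    """
    fmt = file_format.upper()
    if fmt not in SIMPLE_FORMATS:
        raise ValueError(f"this sketch only handles {sorted(SIMPLE_FORMATS)}")
    column_sql = ",\n  ".join(f"{name} {col_type}" for name, col_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {column_sql}\n)\n"
        f"STORED AS {fmt}\n"
        f"LOCATION '{location}'"
    )


# Hypothetical query: join two tables, then expand the array column
# `variants` into one output row per element with UNNEST.
JOIN_WITH_ARRAY_QUERY = """
SELECT s.sample_id, p.site, v.variant
FROM samples AS s
JOIN patients AS p ON p.patient_id = s.patient_id
CROSS JOIN UNNEST(s.variants) AS v (variant)
WHERE p.site = 'boston'
"""

samples_ddl = create_table_ddl(
    "samples",
    [
        ("sample_id", "string"),
        ("patient_id", "string"),
        ("variants", "array<string>"),
    ],
    "s3://example-bucket/genomics/samples/",
)
print(samples_ddl)
```

The table definition only describes files that already sit in S3; no data is copied or loaded when it runs.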

In the end, the power of Amazon Athena comes from the fact that it queries data directly in Amazon S3, so the benefits of that object storage platform carry over to Athena: reduced complexity, the security and performance companies need, and the ability to run multiple queries without having to manage or configure the infrastructure. Companies can focus more on the actual queries and results, not on the platform itself.

John Brandon
Contributor

John Brandon has covered gadgets and cars for the past 12 years having published over 12,000 articles and tested nearly 8,000 products. He's nothing if not prolific. Before starting his writing career, he led an Information Design practice at a large consumer electronics retailer in the US. His hobbies include deep sea exploration, complaining about the weather, and engineering a vast multiverse conspiracy.