Chapter 11: Ad Hoc Queries with Amazon Athena
In Chapter 8, Identifying and Enabling Varied Data Consumers, we explored a variety of data consumers. Now, we will start examining the AWS services that some of these consumers may want to use, starting with those who need to run ad hoc SQL queries on data in the data lake.
SQL is widely used for querying data in a variety of databases, and people with SQL skills are easy to find. As a result, there is significant demand from data consumers for the ability to query data in the data lake using SQL, without first having to move the data into a dedicated traditional database.
Amazon Athena is a serverless, fully managed service that lets you use SQL to directly query data in the data lake, as well as query various other databases. There is no infrastructure to set up, and the cost is based purely on the amount of data that is scanned to complete the query.
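To make the scan-based pricing model concrete, the following is a minimal sketch of how the cost of a single query could be estimated from the number of bytes it scans. The function name, the default price of $5.00 per TB, the 10 MB minimum charge per query, and the use of binary units (1 TB = 1024^4 bytes) are all assumptions made for illustration; check the current Athena pricing for your Region before relying on any of these numbers.

```python
# Hypothetical cost estimator for a scan-priced query service.
# All defaults below are illustrative assumptions, not authoritative pricing.

BYTES_PER_MB = 1024 ** 2  # assumption: binary megabytes
BYTES_PER_TB = 1024 ** 4  # assumption: binary terabytes


def estimate_query_cost(bytes_scanned, price_per_tb=5.00,
                        min_bytes=10 * BYTES_PER_MB):
    """Return the estimated cost in dollars of one query.

    bytes_scanned: amount of data the query read, in bytes.
    price_per_tb:  assumed price per terabyte scanned.
    min_bytes:     assumed minimum billable amount per query.
    """
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    # Very small queries are billed as if they scanned the minimum amount.
    billable_bytes = max(bytes_scanned, min_bytes)
    return billable_bytes / BYTES_PER_TB * price_per_tb


if __name__ == "__main__":
    # Compare a query that scans 200 GB with one that scans 20 GB,
    # for example after the data has been compressed or partitioned.
    for gigabytes in (200, 20):
        cost = estimate_query_cost(gigabytes * 1024 ** 3)
        print(f"{gigabytes} GB scanned: ${cost:.4f}")
```

Because the estimate depends only on bytes scanned, anything that reduces the amount of data a query has to read reduces its cost proportionally under this model.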
In this chapter, we will do a deep dive into Athena...