Understanding how Amazon Athena works
Amazon Athena was initially intended to work with data stored in Amazon S3. As we will see in a later section in this chapter, it can now work with other source types as well.
This feature of Amazon Athena is a game-changer. You can combine disparate data sources just as easily as if they all had the same format. This enables you to join a JSON file with a CSV file or a DynamoDB table with an Amazon Redshift table.
Previously, combining this data meant writing custom code, which would invariably translate into a long development cycle and, more than likely, a solution that did not scale well with large datasets.
Now all you have to do is write a SQL query that combines the two data sources. Due to the underlying technology, this technique will scale well, even when querying terabytes and petabytes of data.
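To make this concrete, here is a minimal sketch of what such a query might look like when it joins a DynamoDB table with an Amazon Redshift table. The catalog, schema, table, and column names (dynamo_catalog, redshift_catalog, orders, customers, customer_id, and so on) are hypothetical placeholders, and the sketch assumes both sources have already been registered with Athena as data catalogs. The submit_query helper uses the boto3 Athena client and is shown only for illustration; it is not called here because it requires AWS credentials and an S3 output location.

```python
def build_join_query(left_table: str, right_table: str, key: str) -> str:
    """Return a SQL statement that joins two tables on a shared key column."""
    return (
        f"SELECT o.{key}, o.order_total, c.customer_name\n"
        f"FROM {left_table} AS o\n"
        f"JOIN {right_table} AS c ON o.{key} = c.{key}"
    )


def submit_query(sql: str, output_location: str) -> str:
    """Submit the query to Athena and return its execution ID."""
    # Imported lazily so the query builder above works without boto3 installed.
    import boto3

    client = boto3.client("athena")
    response = client.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


# Hypothetical fully qualified names: "catalog"."schema"."table".
ORDERS = '"dynamo_catalog"."default"."orders"'
CUSTOMERS = '"redshift_catalog"."public"."customers"'

QUERY = build_join_query(ORDERS, CUSTOMERS, "customer_id")
print(QUERY)
```

The point of the sketch is that the join itself is ordinary SQL. Nothing in the statement depends on where either table physically lives; only the catalog prefix differs.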
Data scientists and data analysts will be able to work at a speed that would have been impossible just a few years ago.
Under the hood, Amazon...