Learning how to query data stored in Amazon S3 with Amazon Athena
Businesses store vast amounts of data in repositories such as Amazon S3. Much of this data is not hosted in a relational database such as Amazon RDS or in a NoSQL database, often because the dataset is not updated or queried regularly. Previously, to run even ad hoc queries or analysis against some of that data, you first had to ingest it into a database and then run your queries against the database.
Amazon Athena is a fully managed serverless solution that allows you to interactively query and analyze data directly in Amazon S3 using standard SQL. There is no infrastructure to provision, and you only pay for the queries you run.
Amazon Athena uses Presto, an open-source SQL query engine designed for ad hoc analysis. You can write standard ANSI SQL, and Athena provides full support for large joins, window functions, and arrays.
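To make this concrete, here is a minimal sketch of submitting a SQL statement to Athena from Python and reading back the rows. It assumes a client created with `boto3.client("athena")`. The `orders` table, the `sales_db` database, and the `s3://example-athena-results/` output location are hypothetical names used only for illustration, and `run_athena_query` is a helper invented for this example, not part of any AWS library.

```python
import time


def run_athena_query(client, sql, database, output_location, poll_seconds=1.0):
    """Run one SQL statement in Athena and return rows as lists of strings.

    `client` is expected to behave like boto3.client("athena").
    """
    # Athena runs queries asynchronously: submitting one returns an ID.
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        state = status["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        reason = status.get("StateChangeReason", "no reason given")
        raise RuntimeError(f"Query {query_id} ended as {state}: {reason}")

    # Only the first page of results is read here; larger result sets
    # would need pagination. For SELECT queries the first row holds the
    # column names.
    results = client.get_query_results(QueryExecutionId=query_id)
    return [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in results["ResultSet"]["Rows"]
    ]


# Hypothetical query: rank each customer's orders by amount using a
# window function. Table and column names are assumptions.
SAMPLE_SQL = """
SELECT customer_id,
       order_id,
       amount,
       RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS amount_rank
FROM orders
LIMIT 10
"""

# Example usage (needs AWS credentials and an existing Athena table):
#   import boto3
#   athena = boto3.client("athena")
#   rows = run_athena_query(athena, SAMPLE_SQL, "sales_db",
#                           "s3://example-athena-results/")
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub object, so the polling logic can be checked without an AWS account.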
Data can...