Deep diving into Amazon Athena
As mentioned previously, Amazon Athena is flexible and can handle both simple and complex queries using standard SQL, including joins and arrays. It can read a wide variety of file formats, including these:
- CSV
- JSON
- ORC
- Avro
- Parquet
It also supports other formats, but these are the most common. In some cases, the files you are working with have already been created, and you have little say over their format. When you can choose the file format, however, it's important to understand the advantages and disadvantages of each option. Sometimes it even makes sense to convert existing files to another format before querying them with Amazon Athena. Let's take a quick look at each of these formats and understand when it makes sense to use each of them.
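To make the idea of converting files before querying them more concrete, here is a minimal Python sketch that turns a small CSV sample into newline-delimited JSON (one object per line) using only the standard library. The function name csv_to_json_lines and the sample data are hypothetical and chosen purely for illustration; converting to a columnar format such as Parquet or ORC would typically require a third-party library or a managed service, which is not shown here.

```python
import csv
import io
import json


def csv_to_json_lines(csv_text: str) -> str:
    """Convert CSV text with a header row into newline-delimited JSON."""
    # DictReader uses the first row as the field names for every record.
    reader = csv.DictReader(io.StringIO(csv_text))
    # Emit one JSON object per line; values stay strings, as in the CSV.
    return "\n".join(json.dumps(row) for row in reader)


# Hypothetical sample data, for illustration only.
sample = "id,name\n1,alice\n2,bob\n"
print(csv_to_json_lines(sample))
```

Note that this sketch keeps every value as a string. A real conversion would usually also decide on column types, which is one of the trade-offs between the formats discussed next.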
CSV files
A Comma-Separated Values (CSV) file is a file in which each value is delimited by a comma separator and each record...