Understanding scale and latency
Ever wonder why companies vaguely describe their products as fast or highly scalable without quantifying those superlatives? For a long time, I thought it was because they were hiding something. Maybe they didn't provide hard numbers because they weren't the fastest or had a terrible gotcha. As it turns out, performance is personal, with dozens of variables affecting how long a query will run. Even the difference between a successful query and an unsuccessful one can come down to random chance associated with your data's natural ordering. These are some of the reasons why companies do not provide straightforward performance figures for their analytics engines. However, this doesn't mean we can't identify useful dimensions for anticipating a workload's performance.
When evaluating Athena's performance, the first thing to understand is that Athena is not likely to be the fastest option. This may be the most...