Deciding whether to ETL or query in place
A service such as Athena blurs the distinction between ETL and querying in place. In the preceding sections, we reviewed common ETL use cases. In this section, we'll unpack the considerations that go into deciding when the downsides of querying in place tip the scale in favor of ETL. You might be curious why we've deliberately framed the choice as defaulting to querying in place. The reason is simple and comes to us courtesy of John Gall, who in 1975 theorized, "A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system." In many ways, querying the data in place can be viewed as the most straightforward starting point. Athena's scalability reduces the need to tailor your data model closely to your access patterns. In Chapter...