Selecting the tools
For this example, as for most ETML problems, our main considerations come down to three things: which interfaces we need to build, which tools can perform the transformation and modeling at the scale we require, and how we orchestrate all of the pieces together. The next few sections cover each of these in turn.
Interfaces and storage
When we execute the extract and load parts of ETML, we need to consider how to interface with the systems that store our data. Whichever database or data technology we extract from, it is important to use tools that can extract at the scale and pace we need. In this example, we can use S3 on AWS for our storage; the interfacing can be handled by the AWS boto3 library and the AWS CLI. Note that we could have selected several other approaches, some of which are listed in Table 9.2 along with their pros and cons.