Selecting the tools
For this example, and for almost any ETML problem, our main considerations boil down to a few simple things, which we cover in the following sections.
Interfaces
When we execute the extract and load parts of ETML, we need to consider how to interface with the systems that store our data. Whichever database or data technology we are extracting from, it is important that we use tools that can extract at the scale and pace we need. In this example, our interfacing can be taken care of by the AWS boto3 library and the S3 Application Programming Interface (API) it surfaces.
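To make the interfacing concrete, the following is a minimal sketch of how the extract and load steps might look with boto3, assuming the data is stored as JSON objects in S3. The helper names (make_s3_client, extract_json, load_json) are hypothetical and only for illustration; the boto3 calls used are client("s3"), get_object, and put_object.

```python
import json


def make_s3_client():
    """Create an S3 client; credentials and region are read from the environment."""
    # Imported here so the helpers below can also be exercised with a stub client.
    import boto3
    return boto3.client("s3")


def extract_json(s3_client, bucket, key):
    """Extract: read a single object from S3 and parse its body as JSON."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return json.loads(response["Body"].read())


def load_json(s3_client, bucket, key, records):
    """Load: serialize the records as JSON and write them to S3."""
    body = json.dumps(records).encode("utf-8")
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
```

With valid AWS credentials configured, we could call extract_json(make_s3_client(), "my-bucket", "raw/data.json"), where the bucket and key are placeholders for our own. Passing the client in as an argument is a design choice that keeps the helpers easy to test without touching AWS.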
The following table shows the pros and cons of using this option:
In the next section, we will consider the decisions we must make around the scalability of our modeling approach. This is very important when working with...