Design choices
To implement data transformations that are robust and scalable, you must make some deliberate design choices. Agreeing on how and where to transform your data allows your team to collaborate more effectively and consistently across pipelines.
Where to apply transformations
As seen in Chapter 2, The Modern Data Stack, a high-level architecture of the data stack resembles the following:
Figure 6.2 – High-level architecture example of a data stack (see Chapter 2, The Modern Data Stack)
At each of these steps, you might consider applying transformations. For instance, as seen in Chapter 3, Data Ingestion, transformations performed during ingestion focus on shaping data into a format and structure compatible with the destination system, such as a relational database. This mainly involves parsing and translating source data. Sometimes, however, you might also consider cleaning, aggregation, and enrichment during ingestion...
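To make the parsing and translating step concrete, here is a minimal sketch of an ingestion-time transformation. It assumes a hypothetical source that delivers JSON strings in which every value is text, and it uses an in-memory SQLite database as a stand-in for the destination system. The record fields, the events table, and the function names are illustrative assumptions, not part of any particular tool.

```python
import json
import sqlite3
from datetime import datetime, timezone

# Hypothetical raw records, as they might arrive from a source system.
# Every value is a string, which is common for API and file-based sources.
RAW_EVENTS = [
    '{"id": "1", "amount": "19.90", "created_at": "2023-01-15T10:30:00Z"}',
    '{"id": "2", "amount": "5.00", "created_at": "2023-01-16T08:00:00Z"}',
]


def parse_event(raw: str) -> tuple[int, float, str]:
    """Parse one raw JSON record and translate its fields to column types."""
    record = json.loads(raw)
    created = datetime.strptime(
        record["created_at"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    # Only type conversion happens here: no cleaning, aggregation, or enrichment.
    return int(record["id"]), float(record["amount"]), created.isoformat()


def ingest(raw_events: list[str], conn: sqlite3.Connection) -> None:
    """Load the parsed records into a relational table (illustrative schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY, amount REAL, created_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?)",
        [parse_event(raw) for raw in raw_events],
    )
    conn.commit()


# An in-memory database stands in for the destination system.
conn = sqlite3.connect(":memory:")
ingest(RAW_EVENTS, conn)
rows = conn.execute(
    "SELECT id, amount, created_at FROM events ORDER BY id"
).fetchall()
```

In this sketch, the transformation is limited to what the destination needs in order to accept the data: identifiers become integers, amounts become numbers, and timestamps are normalized to ISO 8601 strings in UTC.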