Ingesting and processing batch data
Let's start by looking at the logical architecture of a data lakehouse:
The preceding diagram depicts the seven logical layers. Data from the data providers needs to be ingested and transformed. Traditionally, there are two types of batch data ingestion and transformation patterns:
- Extract, transform, load (ETL)
- Extract, load, transform (ELT)
Understanding these patterns is the first step toward seeing how they can be combined for batch ingestion and processing in a data lakehouse.
Let's discuss these patterns in detail.
Differences between the ETL and ELT patterns
On the surface, these patterns may seem similar. However, they differ in their philosophy and in the services that are employed to transform data.
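Before we look at each pattern, the following minimal sketch may help make the difference concrete. It is not a reference implementation: it assumes an in-memory SQLite database standing in for the target store, and the sample records, table names, and function names (`RAW_ROWS`, `sales_etl`, `sales_raw`, `sales_elt`, `run_etl`, `run_elt`) are hypothetical. In the ETL function, the data is cleaned in Python before it is loaded. In the ELT function, the raw data is loaded first and then transformed with SQL inside the target:

```python
import sqlite3

# Hypothetical raw records from a data provider (illustrative only).
RAW_ROWS = [("alice", " 10.50 "), ("bob", "7"), ("carol", "n/a")]


def parse_amount(text):
    """Return the amount as a float, or None if it is not numeric."""
    try:
        return float(text.strip())
    except ValueError:
        return None


def run_etl(conn):
    """ETL: transform outside the target, then load only the cleaned rows."""
    cleaned = [(name, parse_amount(raw)) for name, raw in RAW_ROWS]
    cleaned = [row for row in cleaned if row[1] is not None]
    conn.execute("CREATE TABLE sales_etl (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", cleaned)


def run_elt(conn):
    """ELT: load the raw rows as-is, then transform inside the target."""
    conn.execute("CREATE TABLE sales_raw (customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO sales_raw VALUES (?, ?)", RAW_ROWS)
    # The transformation is expressed in SQL and runs in the target engine.
    conn.execute(
        """
        CREATE TABLE sales_elt AS
        SELECT customer, CAST(TRIM(amount) AS REAL) AS amount
        FROM sales_raw
        WHERE TRIM(amount) GLOB '[0-9]*'
        """
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    run_etl(conn)
    run_elt(conn)
    for table in ("sales_etl", "sales_elt"):
        query = f"SELECT customer, amount FROM {table} ORDER BY customer"
        print(table, conn.execute(query).fetchall())
```

Both functions produce the same cleaned rows. The difference is where the transformation runs and whether the raw data is retained in the target: after `run_elt`, the untransformed records are still available in `sales_raw`, whereas `run_etl` never stores them.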
ETL
The first pattern is ETL. The following diagram depicts a typical ETL pattern: