Organizing data on the Lakehouse
In this section, we will discuss the architectural components of a traditional data warehousing system and how these components can be designed and implemented on the Lakehouse. This is particularly interesting because, as we learned in Chapter 8, The Delta Lake, the Lakehouse has a single data layer, known as Delta Lake. Delta Lake does not provide purpose-built, database-like structures for individual warehouse components such as the operational data store or data marts.
Let’s start with a brief overview of the components of a generic data warehousing system implementation.
Components of a warehouse system
The following diagram shows the various components of a generic data warehouse system:
Figure 10.1 – Data warehouse infrastructure components
Data captured from the source systems enters the data warehousing system at the staging area. The data in the staging area...