Fundamentals of the Delta Lake storage format
In this section, we will learn about the Delta Lake storage format and its core objectives. To do so, we will first detour into the realm of data engineering, which will provide the context for the core objectives. Finally, we will see how the Delta Lake format achieves the core objectives and simplifies the data engineering process.
Delta Lake (https://delta.io/) is an open source storage format that enables an organization’s data engineering teams to build the Lakehouse (see Chapter 1, Introduction to Databricks). Delta Lake aims to bring the following data warehousing characteristics to your data lake:
- Reliability
- Simplicity
- Performance
- Governance
To understand why this is a big deal, we must understand how data engineering on data lakes is currently done, how it affects BI users, and why these characteristics are the holy grail for data engineering teams.