Working with Delta Lake
Delta Lake is a storage layer that sits between Apache Spark and the underlying storage (Azure Blob Storage or Azure Data Lake Storage Gen2) and brings Atomicity, Consistency, Isolation, and Durability (ACID) guarantees to Apache Spark. It maintains a transaction log that records every transaction, and it is this log that makes transactions ACID compliant.
In this recipe, we'll learn how to perform insert, delete, update, and merge operations on Delta Lake tables, and then see how Delta Lake uses the transaction log to implement the ACID properties.
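As a quick preview, the following Spark SQL sketch shows the shape of the operations this recipe covers and how to inspect the transaction log afterwards. The table names (customer, customer_updates) and their schemas are assumptions made for illustration, not objects used later in the recipe:

```sql
-- Hypothetical Delta tables for illustration only
CREATE TABLE IF NOT EXISTS customer (id INT, name STRING) USING DELTA;
CREATE TABLE IF NOT EXISTS customer_updates (id INT, name STRING) USING DELTA;

-- Insert: every successful write becomes a commit in the transaction log
INSERT INTO customer VALUES (1, 'Avery'), (2, 'Kai');
INSERT INTO customer_updates VALUES (1, 'Avery Park'), (3, 'Rowan');

-- Update and delete work much like in a relational database
UPDATE customer SET name = 'Avery P.' WHERE id = 1;
DELETE FROM customer WHERE id = 2;

-- Merge (upsert) changes from a staging table into the target table
MERGE INTO customer AS t
USING customer_updates AS s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET t.name = s.name
WHEN NOT MATCHED THEN INSERT (id, name) VALUES (s.id, s.name);

-- Each statement above appears as a separate commit in the table history
DESCRIBE HISTORY customer;
```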
Getting ready
To get started, follow these steps:
- Log in to https://portal.azure.com using your Azure credentials.
- You will need an existing Azure Databricks workspace and at least one Databricks cluster. You can create these by following the Configuring an Azure Databricks environment recipe.
How to do it…
Let's start by creating a new SQL notebook and ingesting data into Delta Lake:
Note
If you don't have time to follow...
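As a starting point, a first notebook cell for this ingestion step might look like the following sketch; the source path and table name are assumptions for illustration and will differ from the dataset you use:

```sql
-- Ingest an existing Parquet source into a new Delta table
-- (the path and table name here are hypothetical)
CREATE TABLE IF NOT EXISTS customer_raw
USING DELTA
AS SELECT * FROM parquet.`/mnt/data/customer/`;

-- Confirm the data landed in the Delta table
SELECT COUNT(*) FROM customer_raw;
```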