Working with Delta Lake tables
Delta Lake databases, available in Databricks, are compliant with the atomicity, consistency, isolation, and durability (ACID) properties. Delta Lake tables are the tables in these databases. They store their data in Parquet files alongside a transaction log and are optimized for analytic workloads. Delta Lake tables can be used in a data processing notebook to store preprocessed or processed data, and the stored data can be consumed directly by visualization tools such as Power BI.
In this recipe, we will create a Delta Lake database and a Delta Lake table, load data from a CSV file, and perform additional operations such as UPDATE, DELETE, and MERGE on the table.
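As a rough preview of the operations named above, the following sketch builds the corresponding SQL statements in Python. It is an illustration under stated assumptions, not the recipe's actual code: the database name covid_db, the table name covid_data, the staging view covid_updates, and every column name are hypothetical placeholders, and the real columns of covid-data.csv may differ.

```python
# All names below are hypothetical placeholders chosen for illustration.
DATABASE = "covid_db"
TABLE = f"{DATABASE}.covid_data"
UPDATES_VIEW = "covid_updates"  # assumed temp view holding new or changed rows


def delta_statements():
    """Return SQL for the operations named in this recipe, in order."""
    return [
        # Create the database if it does not exist yet.
        f"CREATE DATABASE IF NOT EXISTS {DATABASE}",
        # Create a table stored in the Delta format (assumed columns).
        f"""CREATE TABLE IF NOT EXISTS {TABLE} (
                iso_code STRING, location STRING,
                report_date DATE, new_cases DOUBLE
            ) USING DELTA""",
        # UPDATE: change values in rows that match a condition.
        f"UPDATE {TABLE} SET new_cases = 0 WHERE new_cases IS NULL",
        # DELETE: remove rows that match a condition.
        f"DELETE FROM {TABLE} WHERE location IS NULL",
        # MERGE: update matching rows and insert the rest (an upsert).
        f"""MERGE INTO {TABLE} AS t
            USING {UPDATES_VIEW} AS s
            ON t.iso_code = s.iso_code AND t.report_date = s.report_date
            WHEN MATCHED THEN UPDATE SET t.new_cases = s.new_cases
            WHEN NOT MATCHED THEN
                INSERT (iso_code, location, report_date, new_cases)
                VALUES (s.iso_code, s.location, s.report_date, s.new_cases)""",
    ]


def run_all(spark):
    """Run each statement through a SparkSession's sql() method."""
    for statement in delta_statements():
        spark.sql(statement)
```

In a Databricks notebook, where a SparkSession is predefined as spark, calling run_all(spark) would execute the statements in order. The MERGE statement assumes that the covid_updates view has already been registered, for example from a DataFrame read from the CSV file.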
Getting ready
Create a Databricks workspace and a cluster, as explained in the Configuring the Azure Databricks environment recipe of this chapter.
Download the covid-data.csv file from this link: https://github.com/PacktPublishing/Azure-Data-Engineering-Cookbook-2nd-edition...