You're reading from Data Engineering with Databricks Cookbook Build effective data and AI solutions using Apache Spark, Databricks, and Delta Lake

Product type Paperback

Published in May 2024

Publisher Packt

ISBN-13 9781837633357

Length 438 pages

Edition 1st Edition

Tools

Apache Spark

Concepts

Data Engineering

Author (1):

Pulkit Chadha


Table of Contents (16) Chapters

Preface

1. Part 1 – Working with Apache Spark and Delta Lake

2. Chapter 1: Data Ingestion and Data Extraction with Apache Spark

3. Chapter 2: Data Transformation and Data Manipulation with Apache Spark

4. Chapter 3: Data Management with Delta Lake

5. Chapter 4: Ingesting Streaming Data

6. Chapter 5: Processing Streaming Data

7. Chapter 6: Performance Tuning with Apache Spark

8. Chapter 7: Performance Tuning in Delta Lake

9. Part 2 – Data Engineering Capabilities within Databricks

10. Chapter 8: Orchestration and Scheduling Data Pipeline with Databricks Workflows

11. Chapter 9: Building Data Pipelines with Delta Live Tables

12. Chapter 10: Data Governance with Unity Catalog

13. Chapter 11: Implementing DataOps and DevOps on Databricks

14. Index

Why subscribe?

15. Other Books You May Enjoy

Applying changes (CDC) to Delta tables with Delta Live Tables

Delta Live Tables also supports change data capture (CDC), a technique that identifies and captures changes made to data in a source database and then delivers those changes in real time to a target system. CDC lets you keep your data lake or data warehouse in sync with your operational databases, and it also supports real-time analytics and data science.
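To make the idea of "delivering changes to a target" concrete, here is a minimal plain-Python sketch of applying a change feed. It is not Delta Live Tables code: the event shape (`seq`, `op`, `id`, `data`) and the `apply_cdc` helper are assumptions made for illustration only.

```python
from typing import Any

Event = dict[str, Any]


def apply_cdc(target: dict[int, dict[str, Any]], events: list[Event]) -> dict[int, dict[str, Any]]:
    """Apply change events to `target` in sequence order, keeping only the latest state."""
    # Events can arrive out of order, so sort by the (hypothetical) sequence number first.
    for event in sorted(events, key=lambda e: e["seq"]):
        key = event["id"]
        if event["op"] == "DELETE":
            # Remove the row if it exists; ignore deletes for unknown keys.
            target.pop(key, None)
        else:
            # INSERT and UPDATE are both treated as upserts.
            target[key] = event["data"]
    return target


# A small, deliberately out-of-order change feed (made-up data).
events = [
    {"seq": 3, "op": "DELETE", "id": 2, "data": None},
    {"seq": 1, "op": "INSERT", "id": 1, "data": {"name": "Ana", "city": "Lima"}},
    {"seq": 2, "op": "INSERT", "id": 2, "data": {"name": "Bo", "city": "Oslo"}},
    {"seq": 4, "op": "UPDATE", "id": 1, "data": {"name": "Ana", "city": "Quito"}},
]

synced = apply_cdc({}, events)
print(synced)
```

Sorting by a sequence column before applying each event is what keeps the target consistent when changes arrive late or out of order.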

One of the challenges of CDC is how to handle slowly changing dimensions (SCDs), which are dimensions that store and manage both current and historical data over time in a data warehouse. For example, a customer dimension may have attributes such as name, address, and phone number that can change over time. Depending on your business requirements, you may want to track the history of these changes in different ways. Several types of SCDs define how to handle these changes, such as Type 1 (overwrite the old value), Type 2 (add a new row for each version), Type 3 (add a new column for the previous value), and so on.
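The difference between Type 1 and Type 2 can be sketched in plain Python. This is an illustrative model only, not the book's recipe: the change records, the `start_at`/`end_at` field names, and both helper functions are assumptions chosen for the example.

```python
from typing import Any

Change = dict[str, Any]


def scd_type1(changes: list[Change]) -> dict[int, dict[str, Any]]:
    """Type 1: overwrite in place, so only the latest value per key survives."""
    current: dict[int, dict[str, Any]] = {}
    for change in sorted(changes, key=lambda c: c["seq"]):
        current[change["customer_id"]] = {"address": change["address"]}
    return current


def scd_type2(changes: list[Change]) -> list[dict[str, Any]]:
    """Type 2: add a new row per version and close the previous row's validity range."""
    history: list[dict[str, Any]] = []
    open_row: dict[int, int] = {}  # customer_id -> index of its currently open row
    for change in sorted(changes, key=lambda c: c["seq"]):
        key = change["customer_id"]
        if key in open_row:
            # Close the previous version at the point the new one starts.
            history[open_row[key]]["end_at"] = change["seq"]
        history.append(
            {
                "customer_id": key,
                "address": change["address"],
                "start_at": change["seq"],
                "end_at": None,  # None marks the current version
            }
        )
        open_row[key] = len(history) - 1
    return history


# Made-up address changes for one customer.
changes = [
    {"seq": 1, "customer_id": 7, "address": "1 Elm St"},
    {"seq": 2, "customer_id": 7, "address": "9 Oak Ave"},
]

type1 = scd_type1(changes)
type2 = scd_type2(changes)
print(type1)
print(type2)
```

With Type 1 the old address is lost, while Type 2 keeps both versions and marks which one is current. In Delta Live Tables itself, the Python `dlt.apply_changes` function exposes this choice through its `stored_as_scd_type` argument; that API only runs inside a Databricks pipeline, which is why the sketch above uses plain Python.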

In this recipe...
