Working with data manipulation language commands
Databricks SQL supports the following common data manipulation commands:
INSERT INTO
UPDATE
DELETE FROM
These are standard commands in the database and data warehouse world and do not require detailed unpacking.
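For reference, the three commands can be sketched as follows. This is a minimal illustration only: the `customers` table and its columns are hypothetical and do not come from the text.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE IF NOT EXISTS customers (
  customer_id INT,
  name        STRING,
  city        STRING
);

-- INSERT INTO: add a new row.
INSERT INTO customers (customer_id, name, city)
VALUES (1, 'Ada', 'London');

-- UPDATE: change a column value on the rows that match the predicate.
UPDATE customers
SET city = 'Paris'
WHERE customer_id = 1;

-- DELETE FROM: remove the rows that match the predicate.
DELETE FROM customers
WHERE customer_id = 1;
```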
Instead, we will focus on SQL commands in Databricks SQL that support the data processing patterns specific to the Lakehouse. Let’s start with the very versatile MERGE INTO command.
MERGE INTO
MERGE INTO is technically not a Databricks-specific command, but it is an important one because it allows you to process Slowly Changing Dimensions (SCDs) and Change Data Capture (CDC), as well as perform data deduplication. Let’s look at this command in the context of these processes. MERGE INTO is an advanced command that will appeal more to data engineers than to data analysts. That said, if you are responsible for engineering data sets, you will find this section...
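As a rough sketch of the general shape of the command, a basic upsert might look like the following. The table names `customers` and `customer_updates` and their columns are hypothetical, and both tables are assumed to already exist with matching schemas.

```sql
-- Upsert: apply rows from a hypothetical staging table to a target table.
MERGE INTO customers AS t
USING customer_updates AS s
  ON t.customer_id = s.customer_id
-- A row with the same key already exists in the target: update it.
WHEN MATCHED THEN
  UPDATE SET t.name = s.name, t.city = s.city
-- No row with this key exists in the target yet: insert it.
WHEN NOT MATCHED THEN
  INSERT (customer_id, name, city)
  VALUES (s.customer_id, s.name, s.city);
```

Dropping the WHEN MATCHED clause leaves an insert-only merge, which is one simple way to avoid loading rows whose key is already present in the target.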