Building a slowly changing dimension
In Chapter 4, Designing the Serving Layer, we learned about the different ways to build Slowly Changing Dimensions (SCDs). In this section, we will learn to implement one of them using mapping data flows in ADF/Synapse Pipelines. We will implement the Type 2 SCD, as it involves a slightly more complicated workflow. Once you know how to implement one of the SCDs, implementing the others will be similar.
Let's consider the following example scenario:
- We have a DimDriver dimension table in a Synapse SQL dedicated pool that contains the driver's data. This data doesn't change very often, so it is a good choice for an SCD.
- Let us assume that the changes to the driver data appear periodically as a CSV file in a folder in Azure Data Lake Gen2.
- We have to build an ADF pipeline to take the data from the CSV file and apply it to the DimDriver table while maintaining its history.
- We will build the Flag-based SCD option in this...
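Before building the pipeline, it can help to see the flag-based Type 2 logic in isolation. The following is a minimal sketch in plain Python, not the ADF data flow itself. The column names DriverSK (surrogate key), DriverId (business key), and IsActive (current-row flag), as well as the function name, are assumptions made for illustration; the actual DimDriver schema may differ.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def apply_scd2_flag(
    dim_rows: List[Row],
    changes: List[Row],
    key: str = "DriverId",      # hypothetical business key column
    flag: str = "IsActive",     # hypothetical current-row flag column
    sk: str = "DriverSK",       # hypothetical surrogate key column
) -> List[Row]:
    """Return a new row list with flag-based Type 2 changes applied."""
    # Work on copies so the caller's rows are not modified.
    result = [dict(r) for r in dim_rows]
    next_sk = max((r[sk] for r in result), default=0) + 1
    # Index the currently active row for each business key.
    active = {r[key]: r for r in result if r[flag] == 1}

    for change in changes:
        current = active.get(change[key])
        if current is not None:
            # Skip rows whose attributes match the active version.
            if all(current.get(col) == val for col, val in change.items()):
                continue
            # Retire the old version instead of overwriting it.
            current[flag] = 0
        # Insert the new version as the active row.
        new_row = {sk: next_sk, **change, flag: 1}
        next_sk += 1
        result.append(new_row)
        active[change[key]] = new_row
    return result


if __name__ == "__main__":
    dim = [{"DriverSK": 1, "DriverId": 100, "Name": "Alice",
            "City": "Seattle", "IsActive": 1}]
    incoming = [{"DriverId": 100, "Name": "Alice", "City": "Portland"}]
    for row in apply_scd2_flag(dim, incoming):
        print(row)
```

The key design point is that a changed driver is never updated in place: the existing row is kept with its flag set to 0, and a new row carrying the latest values becomes the active one. This is what preserves the history.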