Building the STG model for the first dimension
We have instructed dbt to load our CSV data into a table, and we can use that table as we would any other model, through the ref(…) function.
That would work, but the CSV under the seed folder might be only a temporary solution, as it is in this case. We therefore prefer to access the data loaded as a seed through sources, as we do with any external data.
Defining the external data source for seeds
We will define an external data source to read our seeds in a metadata-driven way, so that we can easily adapt if the data stops coming from a seed and starts coming from a different place, such as a table from a master data system or a file from a data lake.
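To make the idea concrete, here is a minimal sketch of what a dbt source definition pointing at seed data might look like. The source name, schema, and table name below are hypothetical placeholders, not the values used in this project; only the overall shape (a sources list with a name, a schema, and tables) is standard dbt.

```yaml
version: 2

sources:
  - name: seeds                  # hypothetical source name
    schema: my_seed_schema       # hypothetical: the schema where dbt seed loads the CSV
    tables:
      - name: my_dimension_seed  # hypothetical: by default a seed table is named after its CSV file
```

With a definition like this, a model would read the data with source('seeds', 'my_dimension_seed') instead of ref(…). If the data later moves to another place, only the YAML needs to change (for example the schema, or the database and identifier properties), while the models that read the source stay untouched.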
Let’s create a YAML file to define the sources for the seeds, and then add the config for the seed that we have just created:
- Create a new file named source_seed.yml in the models folder.
- Add the configuration for the seed external source, as in the...