Planning the ETL approach
In this first section of the chapter, we will plan our ETL approach. We will begin by developing a data dictionary to help us specify variables: both the native variables we would like to extract from the data provider and the derived variables we would like to create during transformation. Next, we will use PROC FREQ and PROC UNIVARIATE to research the native variables we select, which will inform the design of our derived variables and ETL process.
Finally, we will use the knowledge we gain not only to design derived variables and the ETL process but also to make decisions about serving up variables and maintaining SAS labels and SAS formats for variables in the warehouse. After we complete this planning phase, we will move on to the next section of the chapter, where we will create the transformation code we planned for in this section. Let's begin by specifying the data.
Specifying data with a data dictionary
Data warehouses are...