Questions
Why is it helpful to consider both analyst and developer users in a data warehouse or data lake?
Why do analyst users of data lakes and developer users of data warehouses need extensive documentation on source datasets?
What are the pros and cons of having analysts access a data lake server directly?
How does having multiple foreign keys in a data warehouse make it more useful to analysts?
Imagine you are hosting an annual dataset in your data warehouse. One year when you receive the dataset, you learn that a new additional categorical variable is included that you find valuable, named ADJPRICE. For the next 2 years, you receive the dataset with ADJPRICE in it coded according to the same system, but the third year, you receive the dataset without ADJPRICE but with ADJPRICE2, which is coded slightly differently than ADJPRICE. If you were to make a crosswalk variable to handle ADJPRICE and ADJPRICE2 in datasets over all these years, what coding would it...