Summary
This chapter introduced several areas to consider in your designs. For data profiling, you want to know your data thoroughly, find the regions that are anomalous, and be ready to smooth the data, fill gaps, and, if necessary, make temporary adjustments with synthetic data until real data can be supplied. When implementing a data factory, you need assurance that the data is of the highest quality, and that covers the core data, its metadata, its trends, and its data profile.
You also learned that raw data is messy: some gaps are legitimate, while others are erroneous and need correction. You also need to know the shape of the data, as defined by its profile, so that its meaning is preserved in downstream processing and the data is not misused.
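As a reminder of what profiling and gap filling can look like in practice, here is a minimal sketch in Python. It is an illustration under stated assumptions rather than the method used in this chapter: the function names profile and fill_gaps, the use of None to mark a gap, and the choice of linear interpolation are all hypothetical. Filled values are flagged as synthetic so they can be replaced when real data arrives.

```python
from statistics import mean, median


def profile(values):
    """Summarize a series in which None marks a gap (hypothetical helper)."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": mean(present),
        "median": median(present),
    }


def fill_gaps(values):
    """Linearly interpolate interior gaps (an assumed fill strategy).

    Returns the filled series and a parallel list of flags that are True
    wherever a value is synthetic. Leading and trailing gaps are left as None.
    """
    filled = list(values)
    synthetic = [False] * len(values)
    known = [i for i, v in enumerate(values) if v is not None]
    # Walk each pair of neighbouring known points and fill what lies between.
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
            synthetic[i] = True
    return filled, synthetic


readings = [10.0, 12.0, None, 16.0, None, None, 22.0]
summary = profile(readings)
filled, synthetic = fill_gaps(readings)
```

Keeping the synthetic flags alongside the filled series means downstream processing can tell supplied values from temporary ones. Whether a given gap should be filled at all still depends on whether it is legitimate or erroneous, which this sketch does not decide.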
Finally, you learned that data has to be interpreted in context, and that the data calendar is one facet of that context. This is essential...