Building density data profiling scripts
Density profiling scripts show which values occur most often within a dataset. They also reveal what percentage of records have no value for a given attribute. Use them in place of the domain values scripts when it is important to understand how the information is distributed.
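As a minimal sketch of what such a script computes, the following Python example uses the standard sqlite3 module against an in-memory database. The customer table, its status column, and the density_profile helper are hypothetical names chosen only for illustration; a real script would run against your own source systems, in the SQL dialect of that system.

```python
import sqlite3


def density_profile(conn, table, column):
    """Return (value, row_count, percent_of_rows) tuples, most frequent first.

    Rows with no value are grouped under None, so the share of missing
    data is visible next to the populated values.
    """
    # Table and column names cannot be bound as query parameters,
    # so pass only trusted identifiers to this helper.
    sql = f"""
        SELECT {column} AS attr_value,
               COUNT(*) AS row_count,
               ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM {table}), 2) AS pct
        FROM {table}
        GROUP BY {column}
        ORDER BY row_count DESC, attr_value
    """
    return conn.execute(sql).fetchall()


# Hypothetical sample data: five customers, one with no status recorded.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO customer (status) VALUES (?)",
    [("ACTIVE",), ("ACTIVE",), ("ACTIVE",), ("CLOSED",), (None,)],
)

profile = density_profile(conn, "customer", "status")
for attr_value, row_count, pct in profile:
    print(attr_value, row_count, pct)
```

With this sample data, ACTIVE accounts for 60 percent of the rows, while CLOSED and the missing value (shown as None) each account for 20 percent. The same GROUP BY and COUNT pattern is the core of a density script regardless of the database it runs on.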
Getting ready
Identify all the attributes which must not contain null values, are used in filters and have not previously been profiled, or need to be standardized according to the semantic data model. In addition, identify any flags in the system and any attributes used to signify the status of information.
How to do it...
Data density scripts can be resource-intensive, so schedule them carefully to avoid affecting the performance of source systems:
1. Identify the necessary attributes from the semantic data model that relate to the requirements from the report mock-ups.
2. Use the data lineage to determine the source systems...