How to manage data quality
In the previous section, we saw that data quality has many dimensions, so it is not something a single solution or service can deliver quickly. Data quality needs to be a company-wide strategy. To build that strategy and to design and architect a solution around it, let us look at each dimension of data quality and investigate how it can be implemented in a real system.
Accuracy
You can check accuracy by first ensuring that the right data enters the dataset when it’s written. Programmers writing stored procedures to insert or update records should know which fields are important and what the valid value ranges for those fields are. They should then write code to check for nulls, zeros, and incorrect data types. Checking quality early, at the source, reduces the effort needed for quality checks later. For example, if there is a salary field in the database, you can run a check across the table to select all rows with zero or negative...
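As a rough illustration of the two kinds of accuracy check described here, the sketch below validates a salary value before it is written and then scans a table for rows that would fail the same rules. It is only a sketch under stated assumptions: the `employees` table, its columns, the sample rows, and the helper names `validate_salary` and `find_suspect_salaries` are all hypothetical, and an in-memory SQLite database stands in for whatever database and stored procedures a real system would use.

```python
import sqlite3


def validate_salary(value):
    """Write-time check (hypothetical rule): reject nulls, non-numbers, zero, negatives."""
    if value is None:
        raise ValueError("salary must not be null")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError("salary must be a number")
    if value <= 0:
        raise ValueError("salary must be greater than zero")
    return value


def find_suspect_salaries(conn):
    """Table-wide check: return rows whose salary is null, non-numeric, zero, or negative."""
    query = """
        SELECT id, name, salary
        FROM employees
        WHERE salary IS NULL
           OR typeof(salary) NOT IN ('integer', 'real')
           OR salary <= 0
        ORDER BY id
    """
    return conn.execute(query).fetchall()


# Hypothetical sample data. The salary column is declared without a type so that
# SQLite stores each value exactly as given, including the bad ones.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary)")
conn.executemany(
    "INSERT INTO employees (id, name, salary) VALUES (?, ?, ?)",
    [
        (1, "Ana", 52000),
        (2, "Ben", 0),
        (3, "Chen", -100),
        (4, "Dee", None),
        (5, "Eli", "n/a"),
    ],
)

suspect_rows = find_suspect_salaries(conn)
for row in suspect_rows:
    print(row)
```

With this sample data, the scan flags every row except the first. In practice, calling something like `validate_salary` before each insert or update would keep such rows out of the table in the first place, leaving the table-wide query as a periodic safety net.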