From data quality monitoring to data observability
The usual approach to data quality relies on manual and automated checks, also called tests, on process inputs and outputs. In this paradigm, on the one hand, the consumer is responsible for checking the validity of their raw material against their own needs, for instance, by validating the schema of the data they receive. On the other hand, the producer checks that the output data conforms to the consumers’ needs by ensuring, for instance, that data manipulation did not degrade its completeness. Often, if the data team runs a well-functioning data quality program, consumers won’t check the inputs themselves, as they expect them to be already validated.
The following figure explains this model; the data quality process ensures that the inputs and outputs are in line with quality expectations:
Figure 2.1 – Data quality outside the application
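To make the two kinds of checks concrete, here is a minimal sketch in plain Python. It assumes records are dictionaries. The schema, the field names, and the helper functions (`schema_violations`, `completeness`, `completeness_preserved`) are hypothetical and chosen only for illustration, not taken from any particular data quality tool:

```python
# Hypothetical schema the consumer expects: field name -> Python type.
EXPECTED_SCHEMA = {"order_id": int, "amount": float, "country": str}


def schema_violations(rows, schema):
    """Consumer-side check: describe rows whose fields or types differ from the schema."""
    problems = []
    for index, row in enumerate(rows):
        if set(row) != set(schema):
            problems.append(f"row {index}: fields {sorted(row)} do not match the schema")
            continue
        for field, expected_type in schema.items():
            value = row[field]
            # Missing values (None) are a completeness concern, not a type concern.
            if value is not None and not isinstance(value, expected_type):
                problems.append(f"row {index}: {field} is not {expected_type.__name__}")
    return problems


def completeness(rows, field):
    """Share of rows in which the field is present and not None."""
    if not rows:
        return 1.0
    filled = sum(1 for row in rows if row.get(field) is not None)
    return filled / len(rows)


def completeness_preserved(inputs, outputs, field, tolerance=0.0):
    """Producer-side check: output completeness must not fall below input completeness."""
    return completeness(outputs, field) >= completeness(inputs, field) - tolerance


# Illustrative data: the transformation lost one "amount" value.
inputs = [
    {"order_id": 1, "amount": 10.0, "country": "BE"},
    {"order_id": 2, "amount": 5.5, "country": "FR"},
]
outputs = [
    {"order_id": 1, "amount": 10.0, "country": "BE"},
    {"order_id": 2, "amount": None, "country": "FR"},
]

print(schema_violations(inputs, EXPECTED_SCHEMA))         # []
print(completeness_preserved(inputs, outputs, "amount"))  # False
```

Here, the consumer-side check passes because every input record matches the expected schema, while the producer-side check fails because the output has lost a value that was present in the input. Both checks run outside the application, on its inputs and outputs, which is the model the figure describes.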
...