Finding solutions to data quality issues – observability, data catalogs, and semantic layers
In the first part of this chapter, we saw how data quality can be impacted in many different ways. Whether the cause is an issue in the source system, a problem with your data pipeline infrastructure, or a data governance challenge, keeping the quality of your data up to par is crucial for making better business decisions and building trust in your data. Luckily, we have a set of tools and techniques at our disposal to address data quality issues. With the problems identified in the first part in mind, we will look at a few solutions to overcome them.
The first solution or technique is observability. Observability is a concept from the software engineering and DevOps fields, where continuously monitoring systems for issues helps minimize or prevent application downtime. When translated to the data world, this means a tool that will give you alerts and visual...