Data Observability for Data Engineering: Proactive strategies for ensuring data accuracy and addressing broken data pipelines

Product type Paperback

Published in Dec 2023

Publisher Packt

ISBN-13 9781804616024

Length 228 pages

Edition 1st Edition

Languages

Python

Tools

SQL Server

Concepts

Data Engineering

Authors (2):

Michele Pinto

Sammy El Khammal

Table of Contents (17) Chapters

Preface

1. Part 1: Introduction to Data Observability

2. Chapter 1: Fundamentals of Data Quality Monitoring

3. Chapter 2: Fundamentals of Data Observability

4. Part 2: Implementing Data Observability

5. Chapter 3: Data Observability Techniques

6. Chapter 4: Data Observability Elements

7. Chapter 5: Defining Rules on Indicators

8. Part 3: How to adopt Data Observability in your organization

9. Chapter 6: Root Cause Analysis

10. Chapter 7: Optimizing Data Pipelines

11. Chapter 8: Organizing Data Teams and Measuring the Success of Data Observability

12. Part 4: Appendix

13. Chapter 9: Data Observability Checklist

14. Chapter 10: Pathway to Data Observability

15. Index

16. Other Books You May Enjoy

Summary

In this chapter, we saw why data quality is important: it allows us to prevent and solve issues in data processes. We explored the dimensions of data quality and the measures that can be taken.

Next, we analyzed the data maturity path that companies embarked on years ago and are still following, and how this path creates an urgent need for an ever-greater focus on data quality.

We also defined the producer-consumer information bias, which leads to a shift in responsibilities among data pipeline stakeholders. To solve this, we proposed the service-level method.

First, data quality must be treated as a service-level agreement, which is a contract between the producer and the consumer. These contracts state the level of quality that the data users require.

Second, the data producers translate the agreements into a set of objectives, each of which aims to support one or several agreements.

Third, to ensure that the objectives are met, the producer must set up indicators to reflect the state of the data.

Finally, the indicators are used to detect quality issues: rules defined on them trigger alerts that prompt the data producer to act. The validity of those rules can be summarized in a scorecard, which solves the information bias problem by keeping everyone informed about the objectives and the way they are controlled.
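The four steps above can be illustrated with a minimal Python sketch. It is not code from the book: the `completeness` indicator, the `Rule` class, the `evaluate_rules` helper, the `amount` column, and the 0.9 threshold are all hypothetical names and values chosen for illustration. The sketch assumes an agreement that has already been translated into one objective (at least 90% of `amount` values are present), measures it with an indicator, checks it with a rule, and reports the outcome in a scorecard alongside any alerts.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# A row is a plain dict; None stands for a missing value (assumption).
Row = Dict[str, Optional[float]]


def completeness(rows: List[Row], column: str) -> float:
    """Indicator: share of rows in which `column` is not missing."""
    if not rows:
        return 0.0
    present = sum(1 for row in rows if row.get(column) is not None)
    return present / len(rows)


@dataclass
class Rule:
    """Hypothetical rule: the indicator must be >= threshold (the objective)."""
    name: str
    indicator: Callable[[List[Row]], float]
    threshold: float


def evaluate_rules(
    rows: List[Row], rules: List[Rule]
) -> Tuple[Dict[str, bool], List[str]]:
    """Return a scorecard (rule name -> passed) and alerts for failed rules."""
    scorecard: Dict[str, bool] = {}
    alerts: List[str] = []
    for rule in rules:
        value = rule.indicator(rows)
        passed = value >= rule.threshold
        scorecard[rule.name] = passed
        if not passed:
            alerts.append(
                f"{rule.name}: {value:.2f} is below objective {rule.threshold:.2f}"
            )
    return scorecard, alerts


# Illustrative data: one of four `amount` values is missing.
rows = [{"amount": 10.0}, {"amount": None}, {"amount": 7.5}, {"amount": 3.0}]
rules = [Rule("amount_completeness", lambda r: completeness(r, "amount"), 0.9)]

scorecard, alerts = evaluate_rules(rows, rules)
print(scorecard)
for alert in alerts:
    print("ALERT:", alert)
```

In this sketch the scorecard is what both producers and consumers would look at, while the alerts go to the producer only. A real setup would compute indicators inside the pipeline and route alerts to a notification channel rather than printing them.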

In the next chapter, we will see why those indicators are the backbone of data observability and how data quality can be turned into data observability.
