You're reading from Data Observability for Data Engineering: Proactive strategies for ensuring data accuracy and addressing broken data pipelines

Product type Paperback

Published in Dec 2023

Publisher Packt

ISBN-13 9781804616024

Length 228 pages

Edition 1st Edition

Languages

Python

Tools

SQL Server

Concepts

Data Engineering

Authors (2):

Michele Pinto

Sammy El Khammal

Table of Contents (17) Chapters

Preface

1. Part 1: Introduction to Data Observability

2. Chapter 1: Fundamentals of Data Quality Monitoring

3. Chapter 2: Fundamentals of Data Observability

4. Part 2: Implementing Data Observability

5. Chapter 3: Data Observability Techniques

6. Chapter 4: Data Observability Elements

7. Chapter 5: Defining Rules on Indicators

8. Part 3: How to adopt Data Observability in your organization

9. Chapter 6: Root Cause Analysis

10. Chapter 7: Optimizing Data Pipelines

11. Chapter 8: Organizing Data Teams and Measuring the Success of Data Observability

12. Part 4: Appendix

13. Chapter 9: Data Observability Checklist

14. Chapter 10: Pathway to Data Observability

15. Index

16. Other Books You May Enjoy

What this book covers

Chapter 1, Fundamentals of Data Quality Monitoring, covers a general introduction to data quality and explains the key metrics used to measure it. It will also explain how data quality can be converted to Service Level Agreements (or contracts) to establish trust among data pipeline stakeholders.

Chapter 2, Fundamentals of Data Observability, will complete the reader’s knowledge of data quality by adding the observability dimension. It explains how data quality monitoring can be improved to provide real-time contextual information on data pipelines.

Chapter 3, Data Observability Techniques, covers how a data engineer can retrieve information from applications at run time. It will be an overview of the existing techniques and will explain their advantages and disadvantages regarding the efficient implementation of Data Observability.

Chapter 4, Data Observability Elements, provides an overview of the elements needed to collect contextual and real-time information from a pipeline. This will cover a description of those elements and showcase an example of how you can collect them within a Python script doing data manipulation.

Chapter 5, Defining Rules on Indicators, introduces the concept of continuous data validation. The reader will understand how the data engineer can implement rules, manually or in code, to test the data, and where such validation rules can be placed.

Chapter 6, Root Cause Analysis, focuses on the data issues and how adopting the Data Observability approach simplifies and may even automate anomaly detection and troubleshooting. It will provide a method for Data Incident Management and anomaly detection examples.

Chapter 7, Optimizing Data Pipelines, explains how data observability can be used to manage several aspects of the data pipeline lifecycle, such as containing maintenance costs, automating documentation, managing the catalog, mitigating anomalies, and reducing change risk.

Chapter 8, Organizing Data Teams and Measuring the Success of Data Observability, focuses on how to introduce Data Observability in your team, describing the different kinds of Data Teams, the different types of organizations where these teams must fit, and how to measure the success of this initiative.

Chapter 9, Data Observability Checklist, suggests a method in the form of a checklist to implement Data Observability in the company pipelines, reviewing the common pitfalls and concerns we encountered when implementing data observability in various companies.

Chapter 10, Pathway to Data Observability, closes the book by providing data engineers with a technical roadmap to implement data observability in a first project and then at scale across the organization.
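
To give a flavor of the ideas summarized for Chapters 4 and 5, here is a minimal sketch of collecting indicators from a Python script and checking rules against them. It is not taken from the book: the `Observation` class, the `observe` and `check` functions, the sample `orders` rows, and the 10% null threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Observation:
    """Hypothetical container for indicators collected at run time."""
    dataset: str
    row_count: int
    null_ratio: dict[str, float]  # share of missing values per column


def observe(dataset: str, rows: list[dict]) -> Observation:
    """Compute simple indicators (row count, null ratios) for a list of records."""
    columns = sorted({key for row in rows for key in row})
    null_ratio = {
        col: (sum(1 for row in rows if row.get(col) is None) / len(rows)) if rows else 0.0
        for col in columns
    }
    return Observation(dataset, len(rows), null_ratio)


# A rule is any function that takes an Observation and returns True when it passes.
Rule = Callable[[Observation], bool]


def check(obs: Observation, rules: dict[str, Rule]) -> list[str]:
    """Return the names of the rules that the observation violates."""
    return [name for name, rule in rules.items() if not rule(obs)]


# Illustrative data: one of four rows is missing its amount.
rows = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": None},
    {"order_id": 3, "amount": 7.5},
    {"order_id": 4, "amount": 3.0},
]

obs = observe("orders", rows)

# Assumed thresholds, for illustration only.
rules = {
    "has_rows": lambda o: o.row_count > 0,
    "amount_mostly_present": lambda o: o.null_ratio.get("amount", 1.0) <= 0.1,
}

failed = check(obs, rules)
print(failed)
```

Keeping the indicators (what was observed) separate from the rules (what is expected) means the same observation can be reused for several checks, or logged for later troubleshooting.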
