Search icon CANCEL
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Data Observability for Data Engineering

You're reading from   Data Observability for Data Engineering Proactive strategies for ensuring data accuracy and addressing broken data pipelines

Arrow left icon
Product type Paperback
Published in Dec 2023
Publisher Packt
ISBN-13 9781804616024
Length 228 pages
Edition 1st Edition
Languages
Arrow right icon
Authors (2):
Arrow left icon
Michele Pinto Michele Pinto
Author Profile Icon Michele Pinto
Michele Pinto
Sammy El Khammal Sammy El Khammal
Author Profile Icon Sammy El Khammal
Sammy El Khammal
Arrow right icon
View More author details
Toc

Table of Contents (17) Chapters Close

Preface 1. Part 1: Introduction to Data Observability
2. Chapter 1: Fundamentals of Data Quality Monitoring FREE CHAPTER 3. Chapter 2: Fundamentals of Data Observability 4. Part 2: Implementing Data Observability
5. Chapter 3: Data Observability Techniques 6. Chapter 4: Data Observability Elements 7. Chapter 5: Defining Rules on Indicators 8. Part 3: How to adopt Data Observability in your organization
9. Chapter 6: Root Cause Analysis 10. Chapter 7: Optimizing Data Pipelines 11. Chapter 8: Organizing Data Teams and Measuring the Success of Data Observability 12. Part 4: Appendix
13. Chapter 9: Data Observability Checklist 14. Chapter 10: Pathway to Data Observability 15. Index 16. Other Books You May Enjoy

Analyzing the data

The simplest and most immediate way to monitor what is happening is to focus on monitoring your data. At the beginning of our data quality journey, it seems correct to start with the data, usually because data is exactly what our customers will be using.

But as we saw in the previous chapter, the data itself is often not sufficient to identify and prevent anomalies; for example, even if we are sure that the data we are producing is correct, it is possible that in the meantime, we are ignoring that our new data pipeline is slowly increasing the execution time as well as the usage of hardware resources. Today, we may not notice an anomaly because it is not a critical problem and data is produced as expected, but it could soon become an issue before we realize it. Indeed, what appears to be a pipeline execution issue may soon be converted into a timeliness issue.

On the other hand, our main concern is to be aware of what is happening in our data. So, what does...

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €18.99/month. Cancel anytime