Practical Data Quality

You're reading from Practical Data Quality: Learn practical, real-world strategies to transform the quality of data in your organization

Product type Paperback
Published in Sep 2023
Publisher Packt
ISBN-13 9781804610787
Length 318 pages
Edition 1st Edition
Author (1):
Robert Hawker

Table of Contents

Preface
Part 1 – Getting Started
Chapter 1: The Impact of Data Quality on Organizations
Chapter 2: The Principles of Data Quality
Chapter 3: The Business Case for Data Quality
Chapter 4: Getting Started with a Data Quality Initiative
Part 2 – Understanding and Monitoring the Data That Matters
Chapter 5: Data Discovery
Chapter 6: Data Quality Rules
Chapter 7: Monitoring Data Against Rules
Part 3 – Improving Data Quality for the Long Term
Chapter 8: Data Quality Remediation
Chapter 9: Embedding Data Quality in Organizations
Chapter 10: Best Practices and Common Mistakes
Index
Other Books You May Enjoy

Basics of data profiling

Data profiling assesses a set of data and reports, for each column, on the values it contains, the length of its strings, its level of completeness, and its distribution patterns. For example, the minimum, maximum, mean, and median are provided for both the values and the string lengths, which helps to identify outliers.
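To make these statistics concrete, the following is a minimal sketch in Python using only the standard library. It is not taken from any particular profiling tool: the function name `profile_column`, the sample column `order_quantities`, and the rule that `None` and blank strings count as missing are all assumptions made for illustration.

```python
from statistics import mean, median


def profile_column(values):
    """Return basic profile statistics for one column (a list of raw values)."""
    # Assumption: None and empty or whitespace-only strings count as missing.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    # String length of every populated value, whatever its type.
    lengths = [len(str(v)) for v in present]
    # Only genuinely numeric values feed the value statistics.
    numbers = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]

    def summarize(xs):
        # No statistics can be computed for an empty list.
        if not xs:
            return None
        return {"min": min(xs), "max": max(xs), "mean": mean(xs), "median": median(xs)}

    return {
        "completeness": len(present) / len(values) if values else 0.0,
        "value_stats": summarize(numbers),
        "length_stats": summarize(lengths),
    }


# Hypothetical column: one missing value and one suspiciously large one.
order_quantities = [5, 7, 6, None, 5, 900]
profile = profile_column(order_quantities)
print(profile)
```

In this sample, the mean of the values sits far above the median, and the maximum string length is longer than the rest. Both are the kind of signal that points to an outlier worth investigating.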

Most of you will have some experience of data profiling, even if you have not heard the term before. The first task that many people perform when looking at an unfamiliar set of data is to open it in a spreadsheet tool and apply a filter (the AutoFilter feature in Microsoft Excel, for example) to all the columns. They then check the values in each column to see whether all the rows share just a couple of values or whether there are many distinct ones. People look to see whether the data is a number, a date, text, and so on. It’s quite common to look for the smallest and largest values. Even this basic action is an...
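The manual filter check described above can be approximated in a few lines of Python. This is a sketch under stated assumptions rather than a description of any tool: the names `infer_kind` and `filter_view`, the sample column `country_codes`, and the single ISO date format `%Y-%m-%d` are all hypothetical choices for illustration.

```python
from collections import Counter
from datetime import datetime


def infer_kind(text):
    """Classify one raw string as 'number', 'date', or 'text'."""
    try:
        float(text)
        return "number"
    except ValueError:
        pass
    try:
        # Assumption: dates arrive in ISO format (YYYY-MM-DD) only.
        datetime.strptime(text, "%Y-%m-%d")
        return "date"
    except ValueError:
        return "text"


def filter_view(values):
    """Summarize a column of strings the way a spreadsheet filter list would."""
    counts = Counter(values)
    kinds = Counter(infer_kind(v) for v in values)
    return {
        "distinct": len(counts),                 # how many different values exist
        "most_common": counts.most_common(3),    # the values most rows share
        "kinds": dict(kinds),                    # number / date / text tallies
        "smallest": min(values),                 # lexicographic for strings
        "largest": max(values),
    }


# Hypothetical column: mostly country codes, with one stray date.
country_codes = ["GB", "GB", "US", "DE", "US", "GB", "2023-09-01"]
view = filter_view(country_codes)
print(view)
```

Here the column has only a handful of distinct values, and the lone date among the text entries stands out in the type tally, much as it would when scrolling through a filter list.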
