Practical Data Quality

You're reading from Practical Data Quality: Learn practical, real-world strategies to transform the quality of data in your organization

Product type Paperback
Published in Sep 2023
Publisher Packt
ISBN-13 9781804610787
Length 318 pages
Edition 1st Edition
Author (1):
Robert Hawker

Causes of bad data

Any of these impacts can cause critical damage to an organization. No organization deliberately plans for its data quality to be poor enough to cause impacts like these. So, how do organizations end up in this position? How does an organization neglect its data to the point where it can no longer achieve its objectives?

Lack of a data culture

Successful organizations try to put a holistic data culture in place. Everyone is educated on the basics of looking after data and the importance of having good data. They consider what they have learned when performing their day-to-day tasks. This is often referred to as the promotion of good data literacy.

Putting a strong data culture in place is a key building block when trying to ensure data remains at an acceptable level of quality for the business to succeed in its objectives. The data culture includes how everyone thinks about data. Many leaders will say that they treat data like an asset, but this can be quite superficial. Doug Laney’s book, Infonomics, explains this best:

“Consider your company’s well-honed supply chain and asset management practices for physical assets, or your financial management and reporting discipline. Do you have similar accounting and asset management practices in place for your “information assets?” Most organizations do not.” (Laney, 2017)

Laney makes an interesting point. Accounting standards allow organizations to value intangible assets – for example, patents, copyrights, and goodwill. These are logged on an asset register and are depreciated over time as their value diminishes. Why do we not do this with data as well? If data had a value attributed to it, then initiatives to eliminate practices that eroded that value would be better received.

We will return to this in later chapters, but for now, suffice it to say that having a data culture is a key building block when striving for good data quality. Many organizations make statements about treating data as an asset and having a data culture, without really taking practical steps to make this so.

Prioritizing process speed over data governance

There is always tension between the speed of a business process and the level of data governance involved in the steps of that process. Efforts to govern and manage data can often be seen as red tape.

Sometimes, a desire for high process speed comes into conflict with the enforcement of data entry rules. There may even be financial incentives for process owners to keep processes shorter than a certain number of days or hours. In these cases, process owners may ask for the data entry process to be simplified and the rules removed.

In the short term, this may result in an improved end-to-end process speed – for example, in procurement, initial requests may be turned into purchase orders more quickly than before. However, as shown in Figure 1.3, a fast process with few data entry rules will result in poor data quality (box 1) and this is unsustainable.

In such cases, the organization experiences what we call data and process breakdown – the dreaded box 2 in Figure 1.3. The initial data entry process is now rapid, but the follow-on processes are seriously and negatively impacted. For example, if supplier bank details are not collected accurately in the initial process, then the payment process will not be completed successfully. The accounts payable team will have to contact the supplier to request the correct details. If the contact details have also not been collected properly, then the team will have a mystery to solve before they can do their job! For one supplier, this can be frustrating, but for large organizations with thousands of suppliers and potentially millions of payments, processes are usually highly automated, and gaps like these become showstopping issues:

Figure 1.3 – Balance of process speed and data quality – avoiding data and process breakdown


When establishing new processes, most organizations start in box 3, where the rules have been established but they are inefficient. For example, rules are applied in spreadsheet-based forms, but the form must be approved by three different people before data can be entered into a system. Some organizations (typically those in regulated industries) move further to the right into box 6 – where the data governance is so complex that process owners feel compelled to act. This often leads to a move back to box 1 – where the process owner instructs their team to depart from the data governance rules, sacrificing data quality for process speed. Again, this brings the data and process breakdown scenario into sharp focus.

Through technology, organizations tend to move to box 4 – for example, a web-based form is added for data input that validates data, connects to the underlying system to save the valid data, and automatically orchestrates approvals as appropriate. As these processes are improved over time, there is the opportunity to move to box 5, for instance by adding lookups to external company databases (such as Dun & Bradstreet) to collect externally validated supplier data, including additional attributes such as supplier risk and company ownership. In the best cases, good master data management can contribute to a higher process speed than would otherwise have been possible.
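
To make the box 4 idea more concrete, here is a minimal sketch in Python of rules being applied at the point of entry, before a supplier record is saved. The field names (name, email, iban), the validate_supplier helper, and the choice of an IBAN checksum as the bank-detail rule are all assumptions made for illustration; they are not taken from the book, and a real form would apply whatever rules the organization has agreed.

```python
import re

# Hypothetical required fields for a supplier record (illustrative only).
REQUIRED_FIELDS = ("name", "email", "iban")

# Deliberately loose email pattern: one "@", no spaces, a dot in the domain.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def iban_checksum_ok(iban: str) -> bool:
    """Return True if the IBAN passes the standard mod-97 check.

    Only the checksum is tested, not country-specific lengths or formats.
    """
    compact = iban.replace(" ", "").upper()
    if len(compact) < 5 or not (compact.isascii() and compact.isalnum()):
        return False
    # Move the country code and check digits to the end, map letters
    # to numbers (A=10 ... Z=35), and test the remainder.
    rearranged = compact[4:] + compact[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


def validate_supplier(record: dict) -> list[str]:
    """Return a list of rule violations; an empty list means the record may be saved."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not str(record.get(field) or "").strip():
            errors.append(f"{field} is required")
    email = str(record.get("email") or "").strip()
    if email and not EMAIL_PATTERN.match(email):
        errors.append("email is not well formed")
    iban = str(record.get("iban") or "").strip()
    if iban and not iban_checksum_ok(iban):
        errors.append("iban fails checksum")
    return errors
```

A form built this way would save the record only when validate_supplier returns an empty list, and would otherwise show the violations to the person entering the data. Because the checks run instantly, they add rules without adding the approval delays that characterize box 3.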

An organization’s position within this model can shift significantly during major organizational change. This might include a reorganization that removes roles relating to data management, or a merger with another organization.

Mergers and acquisitions

In merger and acquisition scenarios, two datasets from different systems often need to be brought together – for example, datasets from two different ERP systems are migrated to a single ERP. These projects frequently have extremely aggressive timelines because of the difficulty of running a newly combined business across multiple systems of record and the cost of maintaining the legacy systems.

When data is migrated on an aggressive timeline, the typical problems are as follows:

  • Data is not de-duplicated across the two different source systems (for example, the same supplier exists for both former organizations and two copies get created in the new system)
  • Data is migrated as-is without being adjusted to work in the new system – which may have different data requirements
  • Data was of poor quality in one or more of the legacy systems, but there is no time to enhance it in the project timeline
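
The first problem in the list above, duplicate suppliers across the two source systems, can be illustrated with a short Python sketch. It is a simplified example under stated assumptions: the record fields (id, name, tax_id), the match_key and find_duplicate_groups helpers, and the list of legal suffixes are hypothetical, and the matching is deliberately naive (exact match on a cleaned tax ID, otherwise on a normalized name). It is not the book's method, and real migrations typically use more robust matching followed by human review.

```python
import re
from collections import defaultdict

# Common legal-form suffixes to ignore when comparing names
# (illustrative, not exhaustive).
LEGAL_SUFFIXES = {"ltd", "limited", "inc", "llc", "gmbh", "plc", "corp", "co"}


def match_key(supplier: dict) -> str:
    """Build a comparison key: prefer the tax ID, else a normalized name."""
    tax_id = re.sub(r"[^0-9A-Za-z]", "", supplier.get("tax_id") or "").upper()
    if tax_id:
        return f"tax:{tax_id}"
    # Lowercase, replace punctuation with spaces, and drop legal suffixes.
    words = re.sub(r"[^a-z0-9 ]", " ", supplier["name"].lower()).split()
    core = [word for word in words if word not in LEGAL_SUFFIXES]
    return "name:" + " ".join(core)


def find_duplicate_groups(system_a: list[dict], system_b: list[dict]) -> list[list[dict]]:
    """Group records from both systems that share a match key."""
    groups = defaultdict(list)
    for source, records in (("A", system_a), ("B", system_b)):
        for record in records:
            # Tag each record with the system it came from.
            groups[match_key(record)].append({**record, "source": source})
    # Only keys seen more than once are candidate duplicates.
    return [members for members in groups.values() if len(members) > 1]
```

Each returned group is a set of candidate duplicates to review before loading, not a confirmed match. One known limitation of this sketch is that a record with a tax ID will never be grouped with a record that lacks one, even if the names agree.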

After a merger, there is usually a significant investment in the harmonization of systems and processes, and this investment covers the migration itself. If the migration encounters these problems and bad data is created in the new systems, however, a budget is rarely set aside to resolve them in a business-as-usual context.
