Defining data quality
The question of data quality, and the consequences of using poor-quality data, is as old as data itself. And yet, old as the problem is, data quality is often hard to define. Many definitions are in use, often interchangeably. Some emphasize different dimensions of the data itself, like its completeness, accuracy, and timeliness. But these dimensions alone are not sufficient to tell whether the data is of good quality.
Other definitions consider whether data is fit for purpose. That is rather broad, but it is a better definition, and it is the one naturally applied every time one looks at a dataset. When determining whether data is fit for purpose, one should also ask whether one trusts the data enough to make a decision, take an action, or build on top of it.
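To make the difference between the two kinds of definition concrete, here is a minimal sketch in Python. It is illustrative only: the function names (`completeness`, `freshness`, `is_fit`), the sample records, and every threshold are assumptions made up for this example, not a standard or an established method. It measures two dimensions of a small dataset and then shows that the same measurements can be good enough for one purpose and not for another.

```python
from datetime import datetime, timedelta, timezone


def completeness(rows, field):
    """Fraction of rows in which `field` is present and not None."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(field) is not None)
    return filled / len(rows)


def freshness(rows, field, now, max_age):
    """Fraction of rows whose timestamp in `field` is at most `max_age` old."""
    if not rows:
        return 0.0
    fresh = sum(
        1
        for row in rows
        if row.get(field) is not None and now - row[field] <= max_age
    )
    return fresh / len(rows)


def is_fit(rows, now, min_completeness, max_age, min_fresh):
    """Hypothetical fit-for-purpose check: thresholds come from the use case."""
    return (
        completeness(rows, "email") >= min_completeness
        and freshness(rows, "updated_at", now, max_age) >= min_fresh
    )


# Made-up sample data: four customer records, one with a missing email
# and one that has not been updated for three days.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
rows = [
    {"id": 1, "email": "a@example.com", "updated_at": now - timedelta(hours=1)},
    {"id": 2, "email": None, "updated_at": now - timedelta(hours=2)},
    {"id": 3, "email": "c@example.com", "updated_at": now - timedelta(days=3)},
    {"id": 4, "email": "d@example.com", "updated_at": now - timedelta(hours=5)},
]

# Two hypothetical purposes with different (assumed) requirements.
report_ok = is_fit(rows, now, min_completeness=0.5,
                   max_age=timedelta(days=30), min_fresh=0.9)
campaign_ok = is_fit(rows, now, min_completeness=0.95,
                     max_age=timedelta(days=1), min_fresh=0.7)

print("monthly report:", report_ok)
print("email campaign:", campaign_ok)
```

With these assumed thresholds, the same four records pass the check for the monthly report and fail it for the email campaign. The dimension scores did not change between the two calls; only the purpose did, which is why the scores by themselves cannot say whether the data is of good quality.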
Organizations today face significant challenges in extracting business value due to poor data quality. By being deliberate about how the business creates, manages, and provides...