Using Aggregates to Clean Data and Examine Data Quality
In Chapter 3, SQL for Data Preparation, you learned how SQL can be used to clean data. While the techniques mentioned in that chapter do an excellent job of cleaning data, aggregates add a number of techniques that can make cleaning data even easier and more comprehensive. In this section, you will look at some of these techniques.
Finding Missing Values with GROUP BY
As mentioned in Chapter 3, SQL for Data Preparation, one of the biggest issues in cleaning data is dealing with missing values. You learned how to find missing values and how to handle them. In this chapter, you will learn how to determine the extent of missing data in a dataset.
Using aggregates, you can measure how much data is missing. This tells you not only which columns have missing data but also whether a column is still usable given how much of it is missing. Depending on the extent of missing data, you will have to determine whether it...
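As a minimal sketch of measuring the extent of missing data with aggregates, the following example builds a small in-memory table and counts its missing values. The customers table, its state column, and the sample rows are hypothetical and exist only for illustration. The queries are wrapped in Python's built-in sqlite3 module so that the example runs without a database server. The SQL itself uses only standard aggregates, although other database systems may differ in details such as how NULL values are sorted.

```python
import sqlite3

# Hypothetical sample data: a tiny customers table in which some
# state values are NULL and one is an empty string.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, state TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "NY"), (2, None), (3, "CA"), (4, ""), (5, "NY"), (6, None)],
)

# COUNT(*) counts every row, while COUNT(state) skips NULLs, so the
# difference between the two is the number of NULL values in the column.
# Multiplying by 1.0 forces floating-point rather than integer division.
total_rows, null_states, null_fraction = conn.execute(
    """
    SELECT COUNT(*),
           COUNT(*) - COUNT(state),
           1.0 * (COUNT(*) - COUNT(state)) / COUNT(*)
    FROM customers
    """
).fetchone()

# A CASE expression inside SUM lets you treat empty strings as missing too.
(missing_or_blank,) = conn.execute(
    """
    SELECT SUM(CASE WHEN state IS NULL OR state = '' THEN 1 ELSE 0 END)
    FROM customers
    """
).fetchone()

# GROUP BY places all NULL values in a single group, so the missing
# values show up as their own row next to the real categories.
state_counts = conn.execute(
    "SELECT state, COUNT(*) FROM customers GROUP BY state"
).fetchall()

print(total_rows, null_states, round(null_fraction, 3), missing_or_blank)
for state, row_count in state_counts:
    print(repr(state), row_count)
```

In this sketch, a column with a small fraction of missing values might still be usable after cleaning, while a column that is mostly missing may not be. The threshold is a judgment call that depends on the analysis, not something the query decides for you.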