In this chapter, we will be working with three datasets. The first two come from data on wine quality that was donated to the UCI Machine Learning Data Repository (http://archive.ics.uci.edu/ml) by P. Cortez, A. Cerdeira, F. Almeida, T. Matos, and J. Reis, which contains information on the chemical properties of various wine samples, along with a rating of the quality from a blind tasting by a panel of wine experts. These files can be found in the data/ folder inside this chapter's folder in the GitHub repository (https://github.com/stefmolin/Hands-On-Data-Analysis-with-Pandas/tree/master/ch_09) as winequality-red.csv and winequality-white.csv for red and white wine, respectively.
Our third dataset was collected using the Open Exoplanet Catalogue database, which can be found at https://github.com/OpenExoplanetCatalogue/open_exoplanet_catalogue/. This database...