Chapter materials
In this chapter, we will be working with three datasets. The first two come from data on wine quality that was donated to the UCI Machine Learning Data Repository (http://archive.ics.uci.edu/ml/index.php) by P. Cortez, A. Cerdeira, F. Almeida, T. Matos, and J. Reis, which contains information on the chemical properties of various wine samples, along with a rating of the quality from a blind tasting by a panel of wine experts. These files can be found in the data/
folder inside this chapter's folder in the GitHub repository (https://github.com/stefmolin/Hands-On-Data-Analysis-with-Pandas-2nd-edition/tree/master/ch_09) as winequality-red.csv
and winequality-white.csv
for red and white wine, respectively.
Our third dataset was collected using the Open Exoplanet Catalogue database, which can be found at https://github.com/OpenExoplanetCatalogue/open_exoplanet_catalogue/. This database provides data in eXtensible Markup Language (XML) format, which is similar to...