Chapter materials
In this chapter, we will be working with three datasets. The first two come from data on wine quality donated to the UCI Machine Learning Data Repository (http://archive.ics.uci.edu/ml/index.php) by P. Cortez, A. Cerdeira, F. Almeida, T. Matos, and J. Reis, and contain information on the chemical properties of various wine samples along with a rating of the quality from a blind tasting session by a panel of wine experts. These files can be found in the data/ folder inside this chapter's folder in the GitHub repository (https://github.com/stefmolin/Hands-On-Data-Analysis-with-Pandas-2nd-edition/tree/master/ch_10) as winequality-red.csv and winequality-white.csv for red and white wine, respectively.
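As a quick orientation, the two wine files could be read into a single pandas DataFrame along the following lines. This is only a minimal sketch: the load_wine() helper and the kind column are hypothetical names introduced here for illustration, the relative data/ path assumes the code is run from this chapter's folder, and the delimiter is inferred rather than assumed because the files may not share the same separator.

```python
from pathlib import Path

import pandas as pd


def load_wine(path, kind):
    """Read one wine quality CSV and tag each row with the wine type.

    Hypothetical helper for illustration; not part of the chapter's code.
    """
    # sep=None asks pandas to infer the delimiter (this requires the Python
    # engine). It is a precaution in case a file uses ';' instead of ','.
    df = pd.read_csv(path, sep=None, engine='python')
    return df.assign(kind=kind)


# Assumption: the working directory is this chapter's folder in the repository
data_dir = Path('data')
files = {
    'red': data_dir / 'winequality-red.csv',
    'white': data_dir / 'winequality-white.csv',
}

# Only attempt the load if both files are actually present
if all(path.exists() for path in files.values()):
    wine = pd.concat(
        [load_wine(path, kind) for kind, path in files.items()],
        ignore_index=True,
    )
    print(wine.kind.value_counts())
```

Keeping the wine type in its own column makes it possible to work with both files as one dataset while still being able to separate red from white later on.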
Our third dataset was collected using the Open Exoplanet Catalogue database, at https://github.com/OpenExoplanetCatalogue/open_exoplanet_catalogue/, which provides data in XML format. The parsed planet data can be found in the data/planets.csv file. For the exercises...