This chapter served as an introduction to machine learning in Python. We discussed the terminology that's commonly used to describe learning types and tasks. Then, we practiced EDA using the skills we learned throughout this book to get a feel for the wine and planet datasets. This gave us some ideas for what kinds of models we would want to build. A thorough exploration of the data is essential before attempting to build a model.
Next, we learned how to prepare our data for use in machine learning models and why it is important to split the data into training and testing sets before modeling. To prepare our data efficiently, we used pipelines in scikit-learn to package up everything from our preprocessing through our model.
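As a quick reminder of how these pieces fit together, here is a minimal sketch of splitting the data and then fitting a pipeline. It assumes a synthetic dataset from `make_classification()` as a stand-in for the chapter's data, and the step names (`'scale'`, `'model'`) along with the choice of `StandardScaler` and `LogisticRegression` are illustrative only, not necessarily what we built earlier:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption; not the wine or planet data)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Split first, so that nothing learned during preprocessing
# comes from the testing set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Package the preprocessing and the model into a single estimator
pipeline = Pipeline([
    ('scale', StandardScaler()),
    ('model', LogisticRegression())
])

# fit() learns the scaling parameters from the training data only;
# score() reuses them when transforming the testing data
pipeline.fit(X_train, y_train)
accuracy = pipeline.score(X_test, y_test)
print(f'Test accuracy: {accuracy:.2f}')
```

Because the scaler lives inside the pipeline, it is fit on the training set alone and merely applied to the testing set, which helps us avoid leaking information from the test data into the model.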
We used unsupervised k-means to cluster the planets using their semi-major axis and period; we also discussed how to use the elbow point...