Entire books have been written on scikit-learn (http://scikit-learn.org/stable/). The library has numerous submodules, only a few of which (for example, sklearn.linear_model and sklearn.ensemble) will be used in this book, in Chapter 7, Making Predictive Models in Healthcare. Here we give an overview of some of the more commonly used submodules. For convenience, we have grouped them by the segments of the data science pipeline discussed in Chapter 1, Introduction to Healthcare Analytics.
Introduction to scikit-learn
Sample data
scikit-learn includes several sample datasets in the sklearn.datasets submodule. At least two of these datasets, sklearn...
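As a minimal sketch of how a sample dataset can be loaded from sklearn.datasets, the following uses load_breast_cancer. That choice of loader is an assumption made here for illustration, as are the variable names; other loaders in the submodule follow the same pattern.

```python
# Illustrative sketch: load one of the bundled sample datasets.
# load_breast_cancer is chosen here as an example only.
from sklearn.datasets import load_breast_cancer

dataset = load_breast_cancer()

# The returned object exposes the feature matrix and the target vector.
X, y = dataset.data, dataset.target
n_samples, n_features = X.shape

# Human-readable names for the target classes and the feature columns.
class_names = list(dataset.target_names)
feature_names = list(dataset.feature_names)

print(n_samples, n_features)
print(class_names)
print(feature_names[:5])
```

The returned object also carries a DESCR attribute with a text description of the dataset, which is worth printing before using the data.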