Bagging is another ensemble type that, interestingly, does not necessarily involve trees. It builds several instances of a base estimator, each trained on a random subset of the original training set, and aggregates their predictions. In this section, we try k-nearest neighbors (KNN) as the base estimator.
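As a minimal sketch of the idea (the synthetic dataset and the estimator settings here are illustrative assumptions, not the exact code used later), we can wrap scikit-learn's `KNeighborsClassifier` in a `BaggingClassifier` so that each KNN instance sees only a random subset of rows and columns:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Illustrative synthetic data; the chapter's own dataset would go here instead.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each of the 20 KNN estimators is fit on a random half of the samples
# and a random half of the features; predictions are then aggregated by voting.
bagged_knn = BaggingClassifier(
    KNeighborsClassifier(n_neighbors=5),
    n_estimators=20,
    max_samples=0.5,
    max_features=0.5,
    random_state=0,
)
bagged_knn.fit(X_train, y_train)
print(bagged_knn.score(X_test, y_test))
```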
Pragmatically, bagging estimators are great for reducing the variance of a complex base estimator, for example, a decision tree with many levels. Boosting, on the other hand, reduces the bias of weak models, such as decision trees with very few levels, or linear models.
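To make that contrast concrete, here is a short sketch (again with an illustrative synthetic dataset and arbitrary settings, not the chapter's code): bagging is paired with a high-variance base estimator (a fully grown tree), while boosting is paired with a high-bias one (a depth-1 stump):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, random_state=0)

# Bagging reduces variance, so the base estimator is a deep, high-variance tree.
bagged_deep_trees = BaggingClassifier(
    DecisionTreeClassifier(max_depth=None), n_estimators=50, random_state=0)

# Boosting reduces bias, so the base estimator is a shallow, high-bias stump.
boosted_stumps = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1), n_estimators=50, random_state=0)

print(cross_val_score(bagged_deep_trees, X, y, cv=5).mean())
print(cross_val_score(boosted_stumps, X, y, cv=5).mean())
```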
To try out bagging, we will perform a hyperparameter search to find the best parameters using scikit-learn's randomized grid search. As we have done previously, we will go through the following process:
- Figure out which parameters to optimize in the algorithm (these are the parameters researchers view as the best...