Summary
In this chapter, we introduced ensembles. An ensemble is a combination of models that can perform better than any of its individual components. We discussed three methods of training ensembles. Bootstrap aggregating, or bagging, reduces the variance of an estimator; bagging uses bootstrap resampling to create multiple variants of the training set, trains a model on each variant, and averages the models' predictions. Random forests are ensembles of bagged decision trees that also consider only a random subset of the features at each split. Boosting is a family of ensemble methods that reduce the bias of their base estimators. AdaBoost is a popular boosting algorithm that iteratively trains estimators on training data that is reweighted according to the previous estimators' errors. Finally, in stacking, a meta-estimator learns to combine the predictions of heterogeneous base estimators.
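As a minimal sketch of these three approaches, the following compares bagging, boosting, and stacking using classes from scikit-learn's ensemble module. The synthetic dataset, the choice of base estimators, and the hyperparameter values are illustrative assumptions rather than the chapter's own examples, and StackingClassifier requires scikit-learn 0.22 or later.

# A minimal sketch comparing bagging, boosting, and stacking with scikit-learn.
# The dataset and hyperparameters are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic classification data for illustration only.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                           random_state=11)

# Bagging: averages decision trees trained on bootstrap samples of the data.
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50,
                            random_state=11)

# Boosting: AdaBoost reweights the training instances after each round so
# that later estimators focus on the previous estimators' errors.
boosting = AdaBoostClassifier(n_estimators=50, random_state=11)

# Stacking: a logistic regression meta-estimator learns to combine the
# predictions of heterogeneous base estimators.
stacking = StackingClassifier(
    estimators=[('lr', LogisticRegression()),
                ('knn', KNeighborsClassifier()),
                ('dt', DecisionTreeClassifier())],
    final_estimator=LogisticRegression())

for name, clf in [('Bagging', bagging), ('AdaBoost', boosting),
                  ('Stacking', stacking)]:
    scores = cross_val_score(clf, X, y, cv=5)
    print('%s mean accuracy: %.3f' % (name, scores.mean()))

Comparing cross-validation scores in this way makes the trade-offs concrete: the bagged trees chiefly reduce variance, AdaBoost chiefly reduces the bias of its weak base estimators, and the stacking meta-estimator learns how much to trust each heterogeneous base estimator.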