Bagging – building an ensemble of classifiers from bootstrap samples
Bagging is an ensemble learning technique that is closely related to the MajorityVoteClassifier
that we implemented in the previous section. However, instead of using the same training set to fit the individual classifiers in the ensemble, we draw bootstrap samples (random samples with replacement) from the initial training set, which is why bagging is also known as bootstrap aggregating.
The concept of bagging is summarized in the following diagram:
In the following subsections, we will work through a simple example of bagging by hand and use scikit-learn for classifying wine samples.
Bagging in a nutshell
To provide a more concrete example of how the bootstrapping aggregating of a bagging classifier works, let's consider the example shown in the following figure. Here, we have seven different training instances (denoted as indices 1-7) that are sampled randomly with replacement in each round of bagging. Each bootstrap sample...