In this chapter, we presented the main concept of creating bootstrap samples and estimating bootstrap statistics. Building on this foundation, we introduced bootstrap aggregating, or bagging, which trains many base learners of the same machine learning algorithm, each on a different bootstrap sample. We then provided a custom implementation of bagging for classification, along with the means to parallelize it. Finally, we showcased scikit-learn's own implementation of bagging for regression and classification problems.
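As a reminder of the scikit-learn interface, the following is a minimal sketch of bagging for classification; the breast cancer dataset and the specific hyperparameter values are illustrative assumptions rather than the exact setup used in the chapter.

```python
# A minimal sketch of bagging with scikit-learn (dataset and hyperparameters
# chosen for illustration only).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train 10 base learners (decision trees by default), each on its own
# bootstrap sample; n_jobs=-1 parallelizes training across all cores.
ensemble = BaggingClassifier(n_estimators=10, n_jobs=-1, random_state=0)
ensemble.fit(X_train, y_train)

print(accuracy_score(y_test, ensemble.predict(X_test)))
```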
The chapter can be summarized as follows. Bootstrap samples are created by resampling with replacement from the original dataset. The main idea is to treat the original sample as the population, and each bootstrap sample as if it were an original sample drawn from that population. When the bootstrap dataset has the same size as the original dataset, each instance has a probability of approximately 63.2% (that is, 1 - (1 - 1/n)^n, which approaches 1 - 1/e as n grows) of appearing in any given bootstrap sample.
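The following is a minimal NumPy sketch of creating a single bootstrap sample and checking that fraction empirically; the dataset size of 1,000 and the random seed are illustrative assumptions.

```python
# A minimal sketch of bootstrap resampling with NumPy (dataset size and seed
# are assumptions for illustration).
import numpy as np

rng = np.random.default_rng(seed=0)
original = np.arange(1000)  # treat this sample as the "population"

# Resample with replacement to the same size as the original dataset.
bootstrap = rng.choice(original, size=original.size, replace=True)

# The fraction of distinct original instances that appear in the bootstrap
# sample is close to 1 - 1/e, i.e. roughly 0.632.
print(np.unique(bootstrap).size / original.size)
```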