Bootstrap aggregation
Bagging is an abbreviation of bootstrap aggregation. The bootstrapping technique originated in a context where analysts had to deal with a scarcity of data. With this statistical approach, subsamples were used to estimate population parameters when a statistical distribution couldn't be figured out a priori. The goal of bootstrapping is to provide a more robust estimate for population parameters where more variability is introduced to a smaller dataset by random subsampling with replacement. Generally, bootstrapping follows the following basic steps:
- Randomly sample a batch of size x with replacement from a given dataset.
- Calculate a metric or parameter from each sample to estimate the population parameters.
- Aggregate the results.
In recent years, bootstrap methods have been used for parameters of machine learning models as well. An ensemble is most effective when its classifiers provide highly diverse decision boundaries. This diversity in ensembles can be achieved...