Bootstrap aggregation, or bagging, is one of the earliest ensemble techniques to be widely adopted by the machine-learning community. Bagging involves creating multiple different models from a single dataset. To understand bagging, it helps to first understand an important statistical technique called bootstrapping.
Bootstrapping involves creating multiple random subsets of a dataset. Because the same data sample can be picked in more than one subset, this is termed sampling with replacement. The advantage of this approach is that it provides an estimate of the standard error of a quantity computed from the data, which a single pass over the whole dataset cannot give. This technique is best explained with an example.
Assume you have a small dataset of 1,000 samples. Based on these samples, you are asked to compute the average of the population that the sample represents. Now, a direct...
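The bootstrapping procedure described above can be sketched in a few lines. This is a minimal illustration, not the author's implementation: it draws repeated resamples with replacement from a hypothetical dataset of 1,000 values (generated here for demonstration) and uses the spread of the resample means to estimate the standard error of the population average.

```python
import random
import statistics

def bootstrap_means(data, n_resamples=200, seed=0):
    """Return the mean of each of n_resamples bootstrap resamples.

    Each resample is the same size as the original dataset and is
    drawn with replacement, so a sample may appear more than once.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=len(data))  # sampling with replacement
        means.append(statistics.mean(resample))
    return means

# Hypothetical dataset of 1,000 samples drawn from a normal distribution
rng = random.Random(1)
data = [rng.gauss(50, 10) for _ in range(1000)]

means = bootstrap_means(data)
estimate = statistics.mean(means)    # bootstrap estimate of the population mean
std_error = statistics.stdev(means)  # bootstrap estimate of the standard error
print(f"mean estimate = {estimate:.2f}, standard error = {std_error:.2f}")
```

The spread of the resample means approximates how much the sample average would vary if we could repeatedly draw fresh datasets from the population, which is exactly the quantity a single dataset cannot reveal on its own.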