Bagging, or bootstrap aggregating, is the first generative ensemble learning technique that this book will present. It is a useful tool for reducing variance, as it creates a number of base learners, each trained on a different sub-sample of the original training set. In this chapter, we will first discuss the statistical method on which bagging is based, bootstrapping. Next, we will present bagging itself, along with its strengths and weaknesses. Finally, we will implement the method in Python, as well as use the scikit-learn implementation, in order to solve regression and classification problems.
The main topics covered in this chapter are as follows:
- The bootstrapping method from computational statistics
- How bagging works
- Strengths and weaknesses of bagging
- Implementing a custom bagging ensemble
- Using the scikit-learn implementation
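Before diving into the chapter, here is a minimal sketch of the core idea behind bootstrapping: drawing samples of the same size as the original dataset, with replacement. The function name `bootstrap_sample` and the toy dataset are illustrative assumptions, not part of the implementations presented later in the chapter.

```python
import random

def bootstrap_sample(data, rng):
    # A bootstrap sample draws len(data) observations from the
    # original data *with replacement*, so some observations may
    # appear multiple times while others are left out entirely.
    return [rng.choice(data) for _ in data]

# Toy dataset of ten observations (hypothetical example data).
original = list(range(10))
rng = random.Random(0)  # fixed seed for reproducibility
sample = bootstrap_sample(original, rng)

# The sample has the same size as the original dataset,
# and every element comes from the original data.
print(len(sample) == len(original))  # True
print(all(x in original for x in sample))  # True
```

Each base learner in a bagging ensemble is trained on one such bootstrap sample, which is what introduces the diversity among learners that bagging relies on.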