Bagging is generally used to reduce the variance of a model. It achieves this by creating an ensemble of base learners, each trained on a unique bootstrap sample of the original training set, which forces diversity among the base learners. Random Forests expand on bagging by inducing randomness not only in each base learner's training samples but also in the features considered at each split. Furthermore, their performance is comparable to that of boosting techniques, although they do not require as much fine-tuning as boosting methods.
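To make the distinction concrete, here is a minimal sketch (not the chapter's own example) comparing bagged decision trees with a Random Forest on a synthetic dataset, using scikit-learn; the dataset and parameter values are illustrative assumptions.

```python
# Illustrative sketch: bagged decision trees vs. a random forest.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

# A toy dataset chosen only for illustration.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Bagging: each tree is trained on a bootstrap sample of the training set,
# but every split considers all features (the default base estimator is a decision tree).
bagging = BaggingClassifier(n_estimators=100, random_state=0)

# Random forest: bootstrap samples *and* a random subset of features at each split,
# which further decorrelates the individual trees.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)

for name, model in [("bagging", bagging), ("random forest", forest)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```

The extra feature randomness typically lowers the correlation between trees, which is what lets the Random Forest reduce variance further than plain bagging.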
In this chapter, we will provide the basic background of Random Forests and discuss the strengths and weaknesses of the method. Finally, we will present usage examples based on the scikit-learn implementation. The main topics covered in this chapter are as follows:
- How Random Forests build their base learners
- How randomness can be utilized in order to build...