Random Forest
The decision tree algorithm that we saw earlier faced the problem of overfitting. Since we fit only one tree on the training data, there is a high chance that the tree will overfit the data without proper pruning. The random forest algorithm reduces variance/overfitting by averaging multiple decision trees, which individually suffer from high variance.
Random forest is an ensemble method of supervised machine learning. Ensemble methods combine predictions obtained from multiple base estimators/classifiers to improve the overall prediction/robustness. Ensemble methods are divided into the following two types:
Bagging: The data is randomly divided into several subsets and the model is trained over each of these subsets. Several estimators are built independently from each other and then the predictions are averaged together, which ultimately helps to reduce variance (overfitting).
Boosting: In the case of boosting, base estimators are built sequentially and each model built is...