Understanding and implementing random forests
Random forests is a predictive algorithm falling under the ambit of ensemble learning algorithms. Ensemble learning algorithms consist of a combination of various independent models (similar or different) to solve a particular prediction problem. The final result is calculated based on the results from all these independent models, which is better than the results of any of the independent models.
There are two kinds of ensemble algorithm, as follows:
Averaging methods: Several similar independent models are created (in the case of decision trees, it can mean trees with different depths or trees involving a certain variable and not involving the others, and so on.) and the final prediction is given by the average of the predictions of all the models.
Boosting methods: The goal here is to reduce the bias of the combined estimator by sequentially building it from the base estimators. A powerful model is created using several weak models.
Random forest...