By creating a number of trees using any valid randomization method, we have essentially created a forest, hence the algorithm's name. Once the ensemble's trees have been generated, their predictions must be combined to produce a functional ensemble. This is usually achieved through majority voting for classification problems and through averaging for regression problems. Random Forests have a number of hyperparameters, such as the number of features to consider at each node split, the number of trees in the forest, and the size of each individual tree. As mentioned earlier, a good starting point for the number of features to consider is as follows:
- The square root of the number of total features for classification problems
- One-third of the number of total features for regression problems
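To make the combination rules and the feature-count heuristic concrete, here is a minimal sketch that uses only the Python standard library. The function names (`suggested_max_features`, `majority_vote`, `average_prediction`) are hypothetical helpers introduced for illustration. Rounding down, the floor of one feature, and the tie-breaking rule are assumptions of this sketch and are not specified in the text.

```python
import math
from collections import Counter
from statistics import mean


def suggested_max_features(n_features, task):
    """Return a starting point for the number of features tried per split."""
    if task == "classification":
        # Square root of the total number of features (rounded down: assumption).
        k = math.isqrt(n_features)
    elif task == "regression":
        # One-third of the total number of features (rounded down: assumption).
        k = n_features // 3
    else:
        raise ValueError("task must be 'classification' or 'regression'")
    # Always consider at least one feature (assumption of this sketch).
    return max(1, k)


def majority_vote(tree_predictions):
    """Combine the class labels predicted by each tree for a single sample."""
    # Ties go to the label that appears first in the list (assumption).
    return Counter(tree_predictions).most_common(1)[0][0]


def average_prediction(tree_predictions):
    """Combine the numeric predictions of each tree for a single sample."""
    return mean(tree_predictions)
```

For example, with 16 features, `suggested_max_features(16, "classification")` gives 4. A forest of three trees predicting `["a", "b", "a"]` for a sample yields `"a"` under majority voting. Three regression trees predicting `[2.0, 3.0, 4.0]` yield an averaged prediction of `3.0`.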
The total number of trees can be fine-tuned...