Random forests
I really wanted to add this section on random forest classifiers, and not just because the name sounds so cool. I may have been accused of stretching metaphors to the breaking point, but this time the metaphor is built right into the name of the technique. We have learned how to make decision trees, and we have learned that they have some weak points. They work best when the data really belongs to distinct, well-differentiated groups. They are not very tolerant of noise in the data. And they get unwieldy if you want to scale them up – you can imagine how big the graph would get with 200 classes rather than the 6 or 7 we were dealing with.
If you want to take advantage of the simplicity and utility of decision trees but need to handle more data, more uncertainty, and more classes, you can use a random forest, which, just as the name indicates, is a whole batch of randomly generated decision trees. Let’s step through the process:
...
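The core idea – train many trees on random resamples of the data, then let them vote – can be sketched in a few lines. This is a minimal illustration, not the book's own code; it assumes scikit-learn and NumPy are available and uses the Iris dataset purely as a stand-in for whatever data you have.

```python
# A minimal sketch of a random forest by hand (assumes scikit-learn and
# NumPy; the dataset and tree settings here are illustrative choices).
from collections import Counter

import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = load_iris(return_X_y=True)

# Grow a "forest": each tree sees a random bootstrap sample of the rows,
# and considers only a random subset of features at each split.
forest = []
for _ in range(25):
    idx = rng.integers(0, len(X), size=len(X))   # sample rows with replacement
    tree = DecisionTreeClassifier(max_features="sqrt", random_state=0)
    forest.append(tree.fit(X[idx], y[idx]))

def predict(sample):
    """Majority vote across all trees in the forest."""
    votes = [int(t.predict([sample])[0]) for t in forest]
    return Counter(votes).most_common(1)[0][0]

print(predict(X[0]))  # the class the forest agrees on for the first sample
```

Because each tree trains on a different random slice of the data, individual trees can be noisy or overgrown, but their combined vote tends to smooth those errors out – which is exactly why the forest scales to messier data and more classes than a single tree.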