We'll now turn to an overview of the techniques, covering regression and classification trees, random forests, and gradient boosting. This will set the stage for their practical use.
An overview of the techniques
Understanding a regression tree
To establish an understanding of tree-based methods, it's probably easier to start with a quantitative outcome and then move on to how they work in a classification problem. The essence of a tree is that the features are partitioned, starting with the first split that reduces the residual sum of squares (RSS) the most. These binary splits continue until the tree terminates. Each subsequent split/partition isn't done on the entire dataset, but only on the portion of the prior split that it falls under...
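To make the idea of "the split that improves the RSS the most" concrete, here is a minimal Python sketch for a single numeric feature. It is an illustration under stated assumptions, not how any particular tree library implements it: the function names `rss` and `best_split` and the toy data are made up for this example, and candidate thresholds are taken to be the midpoints between consecutive distinct feature values.

```python
def rss(values):
    """Residual sum of squares of `values` around their mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Find the binary split of one feature that minimizes the total RSS.

    Returns (threshold, total_rss). The threshold is None when no split
    lowers the RSS below that of the unsplit data.
    """
    best_threshold, best_rss = None, rss(y)
    # Assumption: candidate thresholds are midpoints between distinct values.
    distinct = sorted(set(x))
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        total = rss(left) + rss(right)
        if total < best_rss:
            best_threshold, best_rss = threshold, total
    return best_threshold, best_rss


# Toy data, invented purely for illustration.
x = [1, 2, 3, 10, 11, 12]
y = [5.0, 6.0, 7.0, 20.0, 21.0, 22.0]

# First split: chosen over the entire dataset.
threshold, split_rss = best_split(x, y)

# A subsequent split looks only at the observations in one partition,
# here the left side of the first split, not at the whole dataset.
left_x = [xi for xi in x if xi <= threshold]
left_y = [yi for xi, yi in zip(x, y) if xi <= threshold]
next_threshold, next_rss = best_split(left_x, left_y)
```

On this toy data the first split lands between the two clusters of `x` values, and the second call shows that later splits search only within the partition created by the earlier one. A full tree would repeat this over every feature and stop according to some termination rule, which this sketch leaves out.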