We have seen how linear regression models allow us to predict a numerical outcome, and how logistic regression models allow us to predict a categorical outcome. However, both of these models assume a linear relationship between the predictors and the outcome (or, in the case of logistic regression, the log-odds of the outcome). Classification and Regression Trees (CART) overcome this limitation by generating decision trees, which are also much easier to interpret than the supervised learning models we have seen so far. These decision trees can then be traversed to reach a final decision, where the outcome is either numerical (regression trees) or categorical (classification trees). A simple classification tree used by a mortgage lender is illustrated in Figure 4.7:
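To make this concrete, here is a minimal sketch of fitting and inspecting a classification tree in Python with scikit-learn. The mortgage-style feature names, data values, and the `max_depth=2` setting are hypothetical illustrations chosen for readability, not the lender's actual criteria or the data behind Figure 4.7.

```python
# A minimal sketch: fit a small classification tree and print its splits,
# assuming scikit-learn is installed. All data below is made up for illustration.
from sklearn.tree import DecisionTreeClassifier, export_text

# Each row is a hypothetical applicant: [income ($000s), credit score, down payment (%)]
X = [
    [45, 610, 5],
    [80, 720, 20],
    [60, 680, 10],
    [30, 550, 5],
    [95, 750, 25],
    [50, 640, 15],
]
y = ["deny", "approve", "approve", "deny", "approve", "deny"]  # lending decision

# Restrict the depth so the resulting tree stays small and easy to interpret.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)

# Print the learned splits as text; each path from the root down to a leaf
# corresponds to one possible final decision.
print(export_text(tree, feature_names=["income_k", "credit_score", "down_pct"]))

# Classify a new applicant by traversing the tree from the root to a leaf.
print(tree.predict([[70, 700, 12]]))
```

Because the fitted tree is just a sequence of threshold comparisons, the printed output can be read off directly as a set of lending rules, which is the interpretability advantage described above.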
When traversing a decision tree, start at the root node at the top. From there, traverse left for yes, or positive...