Key concepts for decision tree and random forest regression
Decision trees are an exceptionally useful machine learning tool. They share some of KNN's advantages – they are non-parametric, easy to interpret, and work with a wide range of data – but avoid some of its limitations.
Decision trees group the observations in a dataset based on the values of their features. This is done with a series of binary decisions, starting from an initial split at the root node, and ending with a leaf for each grouping. All observations with the same values, or the same range of values, along the branches from the root node to that leaf, get the same predicted value for the target. When the target is numeric, that is the average value for the target for the training observations at that leaf. Figure 9.6 illustrates this:
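The leaf-prediction idea can be sketched in a few lines of Python. This is a hand-built illustration, not the book's figure: the data values, the single age feature, and the split threshold below are all made up, and a real tree would choose its splits automatically by minimizing prediction error.

```python
import numpy as np

# Toy data loosely matching the sleep example: one feature (age) and a
# numeric target (nightly hours of sleep). All values are invented.
age   = np.array([22, 25, 30, 35, 40, 45, 50, 60])
sleep = np.array([6.5, 7.5, 7.0, 7.0, 6.5, 6.0, 8.0, 8.5])

# A single binary split at the root; the threshold is chosen by hand here
threshold = 42.5
left_mask = age <= threshold

# Each leaf predicts the average target of the training observations it holds
pred_left  = sleep[left_mask].mean()    # mean of the 5 left-leaf targets: 6.9
pred_right = sleep[~left_mask].mean()   # mean of the 3 right-leaf targets: 7.5

def predict(x):
    """Route a new observation through the split and return its leaf's mean."""
    return pred_left if x <= threshold else pred_right
```

Every observation routed to the same leaf receives the same prediction, which is exactly the averaging behavior described above; a full tree simply repeats the splitting until each group is sufficiently homogeneous.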
This is a model of nightly hours of sleep for individuals...