Regularizing a decision tree
In this recipe, we will look at ways to regularize decision trees. We will review and comment on a couple of methods in detail and mention a few more for you to explore.
Getting ready
Obviously, we cannot use L1 or L2 regularization as we did with linear models: a decision tree has no feature weights to penalize and no overall loss, such as the mean squared error or the binary cross-entropy, to which a penalty term could be added, so those methods do not apply here.
But we do have other ways to regularize, such as the max depth of the tree, the minimum number of samples per leaf, the minimum number of samples per split, the max number of features, or the minimum impurity decrease. In this recipe, we will look at those.
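For reference, scikit-learn exposes these knobs as hyperparameters of DecisionTreeClassifier (and DecisionTreeRegressor): max_depth, min_samples_leaf, min_samples_split, max_features, and min_impurity_decrease. The following is only a minimal sketch of setting them on a toy dataset; the values are arbitrary and chosen purely for illustration, and in practice they would be tuned, for instance with cross-validation:

from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Toy 2D dataset, only for illustration
X, y = make_classification(n_samples=500, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

# Arbitrary illustrative values for the regularization hyperparameters
dt = DecisionTreeClassifier(
    max_depth=5,                 # maximum depth of the tree
    min_samples_leaf=10,         # minimum number of samples per leaf
    min_samples_split=20,        # minimum number of samples required to split a node
    max_features='sqrt',         # maximum number of features considered at each split
    min_impurity_decrease=0.01,  # minimum impurity decrease required to split a node
    random_state=0,
)
dt.fit(X, y)
print(dt.get_depth(), dt.get_n_leaves())

Each of these constraints keeps the tree from growing arbitrarily complex, which is what regularization means in this context.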
To do that, we only need the following libraries: scikit-learn, matplotlib, and NumPy. Also, since we will provide some visualizations to give an idea of the effect of regularization, we will use the following plot_decision_function function:
def plot_decision_function(dt...
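The body of this function is truncated in this excerpt. As a rough sketch only, and assuming two-dimensional input features X with labels y (the argument names and details beyond dt are assumptions and may differ from the original), such a helper could look like this:

import matplotlib.pyplot as plt
import numpy as np

def plot_decision_function(dt, X, y, resolution=0.02):
    # Build a grid covering the 2D feature space
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, resolution),
                         np.arange(y_min, y_max, resolution))
    # Predict the class for every point of the grid
    Z = dt.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    # Plot the decision regions and overlay the training samples
    plt.contourf(xx, yy, Z, alpha=0.3)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolor='k')
    plt.show()

The idea is simply to color the plane according to the tree's predictions so that the complexity of the decision boundary, and hence the effect of each regularization hyperparameter, can be seen at a glance.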