Discretization with decision trees consists of using a decision tree to identify the optimal bins in which to sort the variable values. The decision tree is built using the variable to discretize, and the target. When a decision tree makes a prediction, it assigns an observation to one of N end leaves, therefore, any decision tree will generate a discrete output, the values of which are the predictions at each of its N leaves. Discretization with decision trees creates a monotonic relationship between the bins and the target. In this recipe, we will perform decision tree-based discretization using scikit-learn and then automate the procedure with Feature-engine.
Using decision trees for discretization
Getting ready
Discretization...