Combining features with decision trees
In the winning solution of the KDD competition in 2009, the authors created new features by combining two or more variables with decision trees. When examining the variables, they noticed that some had high mutual information with the target yet low correlation, indicating that their relationship with the target was not monotonic. While such features were predictive in tree-based algorithms, linear models could not take advantage of them. Hence, to exploit these features in linear models, they replaced them with the outputs of decision trees trained on individual features, or on combinations of two or three variables, yielding new features with a monotonic relationship with the target.
So, in short, combining features with decision trees is particularly useful for deriving features that are monotonic with the target, which is convenient for linear models. The procedure consists of training a decision tree on one or more variables and then using the tree's predictions as the new feature.
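The following is a minimal sketch of this procedure using scikit-learn. The column names `var_a` and `var_b`, the `tree_combine` helper, and the choice of a depth-3 tree are illustrative assumptions, not part of the original solution; out-of-fold predictions are used so the new feature does not leak the target.

```python
# Sketch: derive a new feature from a decision tree trained on a
# combination of variables (assumed columns "var_a" and "var_b").
import pandas as pd
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeClassifier

def tree_combine(X: pd.DataFrame, y: pd.Series, variables: list) -> pd.Series:
    """Train a shallow decision tree on one or more variables and return
    its out-of-fold predicted probabilities as a new feature."""
    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    # Out-of-fold predictions avoid fitting and predicting on the same rows,
    # which would leak the target into the derived feature.
    preds = cross_val_predict(
        tree, X[variables], y, cv=5, method="predict_proba"
    )[:, 1]
    return pd.Series(preds, index=X.index, name="_".join(variables) + "_tree")

# Usage: replace two non-monotonic features with one tree-derived feature.
# new_feature = tree_combine(X_train, y_train, ["var_a", "var_b"])
```

Because the tree's predicted probability increases with the likelihood of the positive class by construction, the derived feature is monotonic with the target and can be consumed directly by a linear model; keeping the tree shallow limits overfitting on small variable subsets.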