Combining features with decision trees
In the winning solution of the Knowledge Discovery and Data Mining (KDD) competition in 2009, the authors created new features by combining two or more variables using decision trees. When examining the variables, they noticed that some features had high mutual information with the target yet low correlation, indicating that their relationship with the target was not linear. While these features were predictive when used in tree-based algorithms, linear models could not take advantage of them. Hence, to use these features in linear models, they replaced the original variables with the outputs of decision trees trained on individual features, or on combinations of two or three variables, producing new features with a monotonic relationship with the target.
In short, combining features with decision trees is useful for creating features that show a monotonic relationship with the target, which in turn helps linear models make accurate predictions.
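The idea can be sketched with scikit-learn. The snippet below is a minimal illustration, not the KDD solution itself: it uses a synthetic, non-monotonic feature, fits a shallow `DecisionTreeRegressor` on it, and uses the tree's predictions as a derived feature for a linear model.

```python
# Sketch: replace a nonlinear feature with the output of a decision
# tree trained on it, so a linear model can exploit the relationship.
# The data and model settings here are illustrative assumptions.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))          # single raw feature
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, 500)  # nonlinear, non-monotonic target

# Fit a shallow tree on the raw feature...
tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)
# ...and use its predictions as the new, tree-derived feature.
X_tree = tree.predict(X).reshape(-1, 1)

# A linear model fits the tree-derived feature far better than the raw one,
# because the tree output is (by construction) linearly related to the target.
raw_r2 = LinearRegression().fit(X, y).score(X, y)
new_r2 = LinearRegression().fit(X_tree, y).score(X_tree, y)
print(f"R2 on raw feature:          {raw_r2:.2f}")
print(f"R2 on tree-derived feature: {new_r2:.2f}")
```

In practice, the tree should be kept shallow (or tuned with cross-validation) and fit on training data only, so the derived feature does not leak information or overfit.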