Feature importance from tree-based methods
Feature importance, also called variable importance, can be calculated from tree-based methods by summing, for each variable, the reduction in Gini impurity or entropy it produces across all the splits in all the trees.
So, if a particular variable is used to split the data and reduces the Gini impurity or entropy value by a large amount, that feature is important for making predictions. This is a nice contrast to using coefficient-based feature importance from logistic or linear regression, because tree-based feature importances capture non-linear relationships between features and the target. There are other ways of calculating feature importance as well, such as permutation feature importance and SHAP (SHapley Additive exPlanations).
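To make the impurity-based idea concrete, here is a minimal scikit-learn sketch (the rest of this section uses H2O instead); the dataset, model settings, and variable names here are arbitrary choices for illustration, not part of the workflow that follows.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# toy dataset, used only to illustrate the idea
X, y = load_breast_cancer(return_X_y=True, as_frame=True)

rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X, y)

# impurity-based importances: average reduction in Gini impurity per feature
impurity_importances = sorted(
    zip(X.columns, rf.feature_importances_), key=lambda t: t[1], reverse=True
)
print(impurity_importances[:5])

# permutation importance: drop in score when each feature is shuffled
perm = permutation_importance(rf, X, y, n_repeats=5, random_state=42)
perm_importances = sorted(
    zip(X.columns, perm.importances_mean), key=lambda t: t[1], reverse=True
)
print(perm_importances[:5])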
Using H2O for feature importance
We can easily get the importances with drf.varimp(), or plot them with drf.varimp_plot(server=True). The server=True argument uses matplotlib, which allows us to do things such as directly saving the figure with plt.savefig().
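As a concrete sketch, assuming a trained distributed random forest model named drf, the calls might look like the following; the file name data.csv and the target column name label are hypothetical placeholders rather than names from this section.

import h2o
from h2o.estimators import H2ORandomForestEstimator
import matplotlib.pyplot as plt

h2o.init()

# hypothetical dataset: any H2O frame with a target column named "label"
frame = h2o.import_file("data.csv")
frame["label"] = frame["label"].asfactor()  # treat the target as categorical
features = [c for c in frame.columns if c != "label"]

# train a distributed random forest
drf = H2ORandomForestEstimator(ntrees=100, seed=42)
drf.train(x=features, y="label", training_frame=frame)

# variable importances as a pandas DataFrame (or a list of tuples by default)
importances = drf.varimp(use_pandas=True)
print(importances)

# server=True renders the plot with matplotlib rather than displaying it,
# so we can save the current figure directly
drf.varimp_plot(server=True)
plt.savefig("varimp.png")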
The result looks like this: