Considering feature engineering
Let's assume that the non-profit has chosen to use the model whose features were selected with Lasso LARS with AIC (e-llarsic) but would like to evaluate whether you can improve it further. Now that you have removed over 300 features that might have only marginally improved predictive performance but mostly added noise, you are left with more relevant features. However, you also know that the 63 features selected by the GAs (a-ga-rf) produced the same RMSE as the 111 features. This means that while there's something in those extra features that improves profitability, it does not improve the RMSE.
From a feature selection standpoint, many things can be done to approach this problem. For instance, examine the overlap and difference of features between e-llarsic and a-ga-rf, and do feature selection variations strictly on those features to see whether the RMSE dips on any combination while keeping or improving on current profitability...
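As a minimal sketch of the overlap-and-difference step, plain Python sets are enough. The helper `compare_feature_sets` and the placeholder feature names below are hypothetical and only illustrate the idea; in practice you would pass in the actual feature lists selected for e-llarsic and a-ga-rf.

```python
def compare_feature_sets(features_a, features_b):
    """Split two feature lists into shared and exclusive groups."""
    a, b = set(features_a), set(features_b)
    return {
        "shared": sorted(a & b),   # selected by both methods
        "only_a": sorted(a - b),   # selected only by the first method
        "only_b": sorted(b - a),   # selected only by the second method
    }


# Hypothetical placeholder names, not the real selected features.
llarsic_features = ["f1", "f2", "f3", "f4"]
ga_rf_features = ["f2", "f4", "f5"]

groups = compare_feature_sets(llarsic_features, ga_rf_features)
for name, cols in groups.items():
    print(f"{name}: {len(cols)} features -> {cols}")
```

The features in the two exclusive groups are the natural candidates for further selection experiments, since one method kept them and the other did not.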