Random forest and opaque models
Let’s train a random forest classifier on the same data as in the counter-example and check whether it performs better and whether it relies on features similar to those used by the DecisionTree classifier in the original counter-example.
Let’s instantiate, train, and validate the model on the same data using the following fragment of code:
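from sklearn.ensemble import RandomForestClassifier

# Instantiate the model with default hyperparameters and train it
randomForestModel = RandomForestClassifier()
randomForestModel.fit(X_train, y_train)

# Predict labels for the held-out test set
y_pred_rf = randomForestModel.predict(X_test)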
After evaluating the model, we obtain the following performance metrics:
Accuracy: 0.62, Precision: 0.63, Recall: 0.62
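The evaluation step can be sketched as follows. This is a minimal, self-contained illustration rather than the exact evaluation used for the numbers above: the data is synthetic (generated with make_classification, since the original dataset is not shown here), and weighted averaging for precision and recall is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the chapter's data (assumption, not the real dataset)
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Train the model and predict on the test set
randomForestModel = RandomForestClassifier(random_state=42)
randomForestModel.fit(X_train, y_train)
y_pred_rf = randomForestModel.predict(X_test)

# Weighted averaging is an assumption; the averaging mode is not stated in the text
accuracy = accuracy_score(y_test, y_pred_rf)
precision = precision_score(y_test, y_pred_rf, average="weighted")
recall = recall_score(y_test, y_pred_rf, average="weighted")

print(f"Accuracy: {accuracy:.2f}, Precision: {precision:.2f}, Recall: {recall:.2f}")
```

With weighted averaging, recall is mathematically equal to accuracy, so the two values always coincide under this choice.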
Admittedly, these metrics differ from those of the decision tree, but the overall performance is not much different: the 0.03 difference in accuracy is negligible. First, we can extract the important features, reusing the same techniques that were presented in Chapter...
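One common way to extract important features from a fitted random forest is its feature_importances_ attribute. The sketch below assumes that approach (it may not be the exact technique the earlier chapter used), runs on synthetic data, and uses hypothetical feature names.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the chapter's data (assumption, not the real dataset)
X, y = make_classification(n_samples=500, n_features=10, random_state=42)

randomForestModel = RandomForestClassifier(random_state=42)
randomForestModel.fit(X, y)

# Impurity-based importances, one value per input feature
importances = randomForestModel.feature_importances_

# Hypothetical feature names, used only for illustration
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

# Rank features from most to least important
ranked = sorted(zip(feature_names, importances), key=lambda pair: pair[1], reverse=True)

for name, score in ranked[:5]:
    print(f"{name}: {score:.3f}")
```

Because a random forest averages importances over many trees, the ranking is usually more stable than that of a single decision tree, though it can still differ between runs unless random_state is fixed.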