Putting it all together
Now that we have a tested transformer, it is time to put it into action. Using what we have learned so far, we create a Pipeline whose first step is the MeanDiscrete transformer and whose second step is a Decision Tree classifier. We then run cross-validation and print the mean accuracy. The code assumes that X, y, cross_val_score, and DecisionTreeClassifier are already in scope:
from sklearn.pipeline import Pipeline

pipeline = Pipeline([('mean_discrete', MeanDiscrete()),
                     ('classifier', DecisionTreeClassifier(random_state=14))])
scores_mean_discrete = cross_val_score(pipeline, X, y, scoring='accuracy')
print("Mean Discrete performance: {0:.3f}".format(scores_mean_discrete.mean()))
The mean accuracy is 0.917, which is not as good as before, but still very good for a model built on simple binary features.
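To try the same pipeline end to end in a fresh session, here is a minimal self-contained sketch. It makes two assumptions that are not from the text: the MeanDiscrete class is a stand-in that marks each feature as True when it exceeds that feature's training mean, and the data is a synthetic set from make_classification rather than the dataset used in this chapter. The accuracy it prints will therefore differ from the figure quoted above.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier


class MeanDiscrete(BaseEstimator, TransformerMixin):
    """Assumed stand-in: binarize each feature against its training mean."""

    def fit(self, X, y=None):
        # Learn one mean per feature (column) from the training fold only.
        X = np.asarray(X, dtype=float)
        self.mean_ = X.mean(axis=0)
        return self

    def transform(self, X):
        # True where a value is above the learned mean of its feature.
        X = np.asarray(X, dtype=float)
        return X > self.mean_


# Synthetic placeholder data, not the dataset used in the text.
X, y = make_classification(n_samples=200, n_features=10, random_state=14)

pipeline = Pipeline([
    ('mean_discrete', MeanDiscrete()),
    ('classifier', DecisionTreeClassifier(random_state=14)),
])

# Each fold refits the transformer and the classifier together.
scores = cross_val_score(pipeline, X, y, scoring='accuracy')
print("Mean Discrete performance: {0:.3f}".format(scores.mean()))
```

Because the transformer sits inside the Pipeline, cross_val_score computes the feature means on the training portion of each fold only, so nothing learned from the held-out fold affects how that fold is transformed.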