Pruning a recursive partitioning tree
In the previous recipes, we built a complex decision tree for the churn dataset. Such a tree often contains sections that contribute little to classifying instances; removing them helps avoid over-fitting and can improve prediction accuracy on unseen data. In this recipe, we therefore use the cost complexity pruning method to prune the classification tree.
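As a rough guide to what "cost complexity" means here, the classic formulation scores a tree by its error plus a penalty on its size. The notation below is our own shorthand, not taken from this recipe: R(T) is the misclassification risk of tree T, |T| is a measure of its size (the number of terminal nodes in the classic formulation), and alpha is the penalty per unit of size.

```latex
R_{\alpha}(T) = R(T) + \alpha \, |T|
```

A larger alpha favors smaller trees. The CP value that rpart reports is, per its documentation, this penalty rescaled by the risk of the root (no-split) tree, so pruning at a given CP removes every split that does not reduce the relative error by at least that amount.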
Getting ready
You need to have completed the previous recipe, which generates a classification model and assigns it to the churn.rp variable.
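If churn.rp is not at hand, any fitted rpart model has the same structure and can be used for a dry run. The following is a minimal sketch under stated assumptions: it uses the kyphosis data that ships with rpart as a stand-in (not the churn data), and the names fit, best.row, best.cp, and pruned are hypothetical. It shows the cptable component whose xerror column the pruning steps rely on, and how a CP value taken from that table can be passed to prune.

```r
library(rpart)

set.seed(2)  # cross-validation folds are random, so fix the seed

# Stand-in model: kyphosis ships with rpart; it is NOT the churn data.
# A small cp and minsplit deliberately grow an over-complex tree.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class",
             control = rpart.control(cp = 0.001, minsplit = 5))

# cptable has one row per candidate subtree and five columns:
# CP, nsplit, rel error, xerror, xstd.
print(colnames(fit$cptable))

# Row with the lowest cross-validated error, and the CP stored in that row.
best.row <- which.min(fit$cptable[, "xerror"])
best.cp  <- fit$cptable[best.row, "CP"]

# prune() returns the subtree that corresponds to the chosen CP.
pruned <- prune(fit, cp = best.cp)
```

Because the xerror values come from random cross-validation folds, the selected row can change between runs unless the seed is fixed.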
How to do it...
Perform the following steps to prune the classification tree:
- Find the minimum cross-validation error of the classification tree model:
> min(churn.rp$cptable[,"xerror"])
Output
[1] 0.4707602
- Locate the record with the minimum cross-validation error:
> which.min(churn.rp$cptable[,"xerror"])
Output
7
- Get the cost complexity parameter of the record with the minimum cross-validation error:
> churn...