Summary
This chapter covered two classification methods that partition the data according to the values of its features. Decision trees use a divide-and-conquer strategy to create flowchart-like structures, while rule learners use a separate-and-conquer strategy to identify logical if-else rules. Both methods produce models that can be interpreted without a statistical background.
One popular and highly configurable decision tree algorithm is C5.0. We used the C5.0 algorithm to create a tree to predict whether a loan applicant will default. Using options for boosting and cost-sensitive errors, we were able to improve accuracy and avoid approving risky loans that cost the bank more money.
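To make that workflow concrete, the following is a minimal sketch of fitting a C5.0 tree with boosting and a cost matrix using the C50 package. The data frame names (credit_train, credit_test), the target column (default), the number of boosting trials, and the penalty values are illustrative assumptions rather than the chapter's exact settings.

```r
library(C50)

# Basic tree: predict loan default from all other features
credit_model <- C5.0(default ~ ., data = credit_train)

# Boosting: combine several trees (here, 10 trials) into a stronger committee
credit_boost <- C5.0(default ~ ., data = credit_train, trials = 10)

# Cost-sensitive learning: penalize a missed default (predicted "no",
# actually "yes") more heavily than a false alarm
error_cost <- matrix(c(0, 1, 4, 0), nrow = 2,
                     dimnames = list(predicted = c("no", "yes"),
                                     actual    = c("no", "yes")))
credit_cost <- C5.0(default ~ ., data = credit_train, costs = error_cost)

# Evaluate on held-out applicants
credit_pred <- predict(credit_cost, credit_test)
table(credit_pred, credit_test$default)
```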
We also used two rule learners, 1R and RIPPER, to develop rules for identifying poisonous mushrooms. The 1R algorithm used a single feature to achieve 99 percent accuracy in identifying potentially fatal mushroom samples. On the other hand, the set of nine rules generated by the more sophisticated RIPPER algorithm correctly identified the edibility of all the mushroom samples.
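As a companion sketch, both rule learners are available in R through the RWeka interface to Weka (which requires a Java installation): OneR() implements 1R and JRip() implements RIPPER. The data frame name (mushrooms) and target column (type) are assumptions for illustration.

```r
library(RWeka)

# 1R: build a rule set from the single most predictive feature
mushroom_1R <- OneR(type ~ ., data = mushrooms)

# RIPPER: JRip() is Weka's implementation of the RIPPER rule learner
mushroom_JRip <- JRip(type ~ ., data = mushrooms)

# Printing a model displays its human-readable if-else rules
print(mushroom_1R)
print(mushroom_JRip)

# Training-set performance for a quick sanity check
summary(mushroom_JRip)
```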