Chapter 3. Bagging
Decision trees were introduced in Chapter 1, Introduction to Ensemble Techniques, and then applied to five different classification problems, where they were seen to perform better on some datasets than others. There, we mostly used the default settings of the rpart function when constructing decision trees. This chapter begins by exploring some options that are likely to improve the performance of a decision tree. The previous chapter introduced the bootstrap method, mainly in the context of statistical methods and models. In this chapter, we will apply it to trees; bootstrapping is now generally accepted as a machine learning technique as well. Bootstrapping decision trees and aggregating their predictions is widely known as bagging, short for bootstrap aggregating. A related classification method is k-nearest neighbors, abbreviated as k-NN. We will introduce this method in the third section and apply the bagging technique to it in the concluding section of the chapter.
In this chapter, we...