Chapter 8. Ensemble Diagnostics
In earlier chapters, ensemble methods were found to be effective, and in the previous chapter we looked at scenarios in which an ensemble increases the overall accuracy of a prediction. Throughout, it was assumed that the base learners are independent of each other. However, unless we have a very large sample and each base model is trained on a distinct set of observations, this assumption is rarely realistic. Even if the sample were large enough for the partitions to be nonoverlapping, each base model would be built on a different partition, and each partition carries the same information as any other partition, so the resulting models need not be independent. Assumptions such as this are difficult to test directly, so we need to employ various techniques to check the independence of base models built on the same dataset. To do this, we will look at several methods. A brief discussion of the need for ensemble diagnostics will kick off this...