Summary
In this chapter, we looked at why ensembling works in the context of classification problems. A series of detailed programs illustrated the point that each classifier must be better than a random guess. We considered scenarios where all the classifiers have the same accuracy, where they have different accuracies, and finally where the accuracies are completely arbitrary. Majority and weighted voting were illustrated within the context of the random forest and bagging methods. For the regression problem, we used a different choice of base learners and allowed them to be heterogeneous. Simple and weighted averaging methods were illustrated using the housing sales price data. A simple illustration of stacked regression concluded the technical section of this chapter.
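As a compact reminder of why each classifier must beat a random guess, the following minimal sketch computes the accuracy of a majority vote. It assumes a binary problem, an odd number of classifiers that err independently, and a common accuracy p. The function name `majority_vote_accuracy` is hypothetical, and Python is used here purely for illustration; this is not one of the chapter's programs.

```python
from math import comb


def majority_vote_accuracy(n_classifiers, p):
    """Probability that a strict majority of classifiers is correct.

    Assumes independent classifiers, each correct with probability p,
    so the number of correct votes follows a Binomial(n_classifiers, p).
    """
    if n_classifiers % 2 == 0:
        raise ValueError("use an odd number of classifiers to avoid ties")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    # Smallest number of correct votes that forms a strict majority.
    k_min = n_classifiers // 2 + 1
    return sum(
        comb(n_classifiers, k) * p**k * (1 - p) ** (n_classifiers - k)
        for k in range(k_min, n_classifiers + 1)
    )


if __name__ == "__main__":
    # Compare a weak-but-useful learner (p > 0.5) with a worse-than-random one.
    for p in (0.4, 0.5, 0.6):
        for n in (1, 3, 11, 101):
            print(f"p={p}, n={n}: {majority_vote_accuracy(n, p):.4f}")
```

Under these assumptions, the ensemble accuracy rises with the number of classifiers when p is above 0.5, stays at 0.5 when p equals 0.5, and falls when p is below 0.5.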
In the following chapter, we will look at ensembling diagnostics.