Summary
In this chapter, we extended the concept of ensemble learning to a generic forward stage-wise additive model, where the task of each new estimator is to minimize a generic cost function. Given the complexity of a full joint optimization, we presented a gradient descent technique that, combined with a line search over the estimator weights, can yield excellent performance in both classification and regression problems.
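As a concrete illustration, the following is a minimal sketch of gradient boosting using scikit-learn; the dataset, the GradientBoostingClassifier estimator, and all hyperparameter values are illustrative assumptions, not the chapter's exact configuration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Illustrative toy dataset (an assumption for this sketch)
X, y = make_classification(n_samples=500, n_features=20, random_state=1000)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1000)

# Each new tree is fit to the negative gradient of the loss, and
# learning_rate scales the contribution of each stage to the additive model
gbc = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                 random_state=1000)
gbc.fit(X_train, y_train)

print('Test accuracy: {:.3f}'.format(gbc.score(X_test, y_test)))
```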
The remainder of the chapter covered how to build ensembles from a few strong learners, either averaging their predictions or taking a majority vote. We discussed the main drawback of thresholded (hard-voting) classifiers, and we showed how it's possible to build a soft-voting model that trusts the estimators showing the least uncertainty. Other useful topics were the stacking method, which consists of using an extra classifier to process the predictions of all the members of the ensemble, and how it's possible to create candidate ensembles that are evaluated so as to select the best-performing combination.
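The sketch below shows one possible way to set up both a soft-voting model and a stacking model with scikit-learn; the choice of base estimators and of the meta-classifier is an assumption made for brevity, not the chapter's specific setup:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Illustrative toy dataset (an assumption for this sketch)
X, y = make_classification(n_samples=500, n_features=20, random_state=1000)

# Hypothetical set of strong learners; SVC needs probability=True so that
# it can contribute class probabilities to the soft vote
estimators = [
    ('lr', LogisticRegression(max_iter=1000)),
    ('dt', DecisionTreeClassifier(max_depth=5, random_state=1000)),
    ('svc', SVC(probability=True, random_state=1000)),
]

# Soft voting averages the predicted class probabilities, so the most
# confident (least uncertain) estimators dominate the final decision
soft_voting = VotingClassifier(estimators=estimators, voting='soft')
soft_voting.fit(X, y)

# Stacking trains a meta-classifier on the cross-validated predictions
# of the base estimators instead of averaging them directly
stacking = StackingClassifier(estimators=estimators,
                              final_estimator=LogisticRegression())
stacking.fit(X, y)
```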