Summary
Boosting is another extension of decision trees. It is a sequential, iterative technique in which the observations misclassified in the previous iteration receive greater weight in the next. We began with the important adaptive boosting (AdaBoost) algorithm and used very simple toy data to illustrate its underpinnings. The approach was then extended to the regression problem, and we illustrated the gradient boosting method with two different approaches. The two packages adabag and gbm were briefly discussed, and the concept of variable importance was emphasized once again. For the spam dataset, boosting gave higher accuracy, which makes the deliberations on the boosting algorithm especially useful.
The chapter considered different variants of the boosting algorithm; however, we did not discuss why boosting works at all. These aspects will be covered in more detail in the next chapter.