The gbm package
The R gbm package, created by Greg Ridgeway, is a very versatile package. Details of the package can be found at http://www.saedsayad.com/docs/gbm2.pdf. That document covers the theoretical aspects of gradient boosting and illustrates various other parameters of the gbm function. First, we will consider the shrinkage factor available in the gbm function.
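The shrinkage parameter is very important because it helps with the problem of overfitting. It scales down the contribution of each tree added to the ensemble, and penalization is achieved through this option. For the spam dataset, we will set the shrinkage option to 0.1 (very large) and 0.0001 (very small) and look at how the performance is affected. Since the bernoulli distribution in gbm expects a 0/1 response, the type factor is first recoded to numeric: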
> spam_Train2 <- spam_Train
> spam_Train2$type <- as.numeric(spam_Train2$type)-1
> spam_gbm <- gbm(spam_Formula,distribution="bernoulli",
+   data=spam_Train2, n.trees=500,bag.fraction = 0.8,
+   shrinkage = 0.1)
> plot(spam_gbm) # output suppressed
> summary(spam_gbm)
...
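As a rough, self-contained illustration of the shrinkage effect, the following sketch fits the same gbm model twice on simulated data (not the spam data), once with shrinkage = 0.1 and once with shrinkage = 0.0001. The data frame sim_data, the helper fit_with_shrinkage, and the coefficients used to simulate the response are assumptions made for this example only.

```r
library(gbm)

# Simulated binary response; it stands in for the spam data purely for illustration.
set.seed(123)
n <- 1000
sim_data <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim_data$y <- rbinom(n, 1, plogis(2 * sim_data$x1 - sim_data$x2))

# Hypothetical helper: fit the same model, varying only the shrinkage.
fit_with_shrinkage <- function(shrinkage) {
  gbm(y ~ x1 + x2, distribution = "bernoulli", data = sim_data,
      n.trees = 500, bag.fraction = 0.8, shrinkage = shrinkage,
      verbose = FALSE)
}

gbm_large <- fit_with_shrinkage(0.1)
gbm_small <- fit_with_shrinkage(0.0001)

# Training loss after the final (500th) tree for each fit.
final_error <- c(large = tail(gbm_large$train.error, 1),
                 small = tail(gbm_small$train.error, 1))
print(final_error)
```

With the same 500 trees, the small-shrinkage fit should have moved only slightly from its starting point, so its training loss stays higher. A very small shrinkage generally needs many more trees to reach a comparable fit.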