Gradient Boosting and XGBoost
What Is Boosting?
Boosting is a procedure for creating ensembles of many machine learning models, or estimators, similar to the bagging concept that underlies the random forest model. Like bagging, while boosting can be used with any kind of machine learning model, it is commonly used to build ensembles of decision trees. A key difference from bagging is that in boosting, each new estimator added to the ensemble depends on all the estimators added before it. Because the boosting procedure proceeds in sequential stages, and the predictions of ensemble members are added up to calculate the overall ensemble prediction, it is also called stagewise additive modeling. The difference between bagging and boosting can be visualized as in Figure 6.1:
While bagging trains many estimators using different random samples of the training data, boosting trains new estimators using information about which...