The xgboost package
The xgboost
R package is an optimized, distributed implementation of the gradient boosting method. The implementation is known to be efficient, flexible, and portable; see https://github.com/dmlc/xgboost for more details and regular updates. It provides parallel tree boosting and has therefore proved immensely useful in the data science community, especially since a large fraction of the competition winners at www.kaggle.com use the xgboost
technique. A partial list of Kaggle winning solutions is available at https://github.com/dmlc/xgboost/tree/master/demo#machine-learning-challenge-winning-solutions.
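A minimal sketch of fitting a boosted tree model with the package is shown below. It uses the `agaricus` demo dataset that ships with xgboost; the parameter values (`nrounds`, `nthread`) are illustrative choices, not tuned recommendations.

```r
# Minimal xgboost example; assumes the xgboost package is installed.
library(xgboost)

# The agaricus (mushroom) demo data is bundled with the package.
data(agaricus.train, package = "xgboost")

# Fit a binary classifier with 10 boosting rounds on 2 threads.
bst <- xgboost(data = agaricus.train$data,
               label = agaricus.train$label,
               nrounds = 10,
               objective = "binary:logistic",
               nthread = 2,
               verbose = 0)

# Predicted probabilities on the training data.
pred <- predict(bst, agaricus.train$data)
```

The `nthread` argument illustrates the parallel processing discussed next: xgboost distributes the tree construction across the requested number of cores.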
The main advantages of the extreme gradient boosting implementation are as follows:
- Parallel computing: The package supports parallel processing via OpenMP, which allows it to use all the cores of the computing machine
- Regularization: This helps to avoid the problem of overfitting by incorporating...