Gradient-boosted decision trees
Gradient boosting is an ensemble learning method that combines multiple models sequentially to produce a stronger ensemble. Unlike bagging, where multiple strong models are trained independently (in parallel), boosting trains a sequence of weak learners, each one learning from the mistakes of those before it, to build a more accurate and robust ensemble. Another difference from bagging is that each model in the sequence is trained on the entire dataset rather than on a bootstrap sample.
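The sequential residual-fitting idea can be sketched in a few lines of Python. The toy regression data, the depth-1 stumps, and the fixed learning rate below are illustrative assumptions; the full gradient boosting algorithm generalizes residuals to gradients of an arbitrary loss function.

```python
# A minimal sketch of the boosting idea for regression (illustrative only):
# weak learners (depth-1 stumps) are trained one after another, each fitting
# the residual errors left by the ensemble built so far. Every learner sees
# the entire training set, unlike bagging's bootstrap samples.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(42)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
n_rounds = 100

# Start from a constant prediction (the mean of the targets).
prediction = np.full_like(y, y.mean())
stumps = []

for _ in range(n_rounds):
    residuals = y - prediction            # mistakes of the current ensemble
    stump = DecisionTreeRegressor(max_depth=1).fit(X, residuals)
    stumps.append(stump)
    prediction += learning_rate * stump.predict(X)  # add the correction

print("final training MSE:", np.mean((y - prediction) ** 2))
```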
Note
As discussed next, gradient boosting always builds a series of regression trees as the members of the ensemble, regardless of whether the problem being solved is regression or classification. For this reason, gradient boosting is also called Multiple Additive Regression Trees (MART).
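One quick way to see this (assuming scikit-learn's GradientBoostingClassifier is used) is to inspect a fitted classifier and check the type of its base learners, which are regression trees even though the task is classification:

```python
# Illustrative check: the base learners inside a fitted gradient boosting
# *classifier* are DecisionTreeRegressor objects, not classifiers.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=200, random_state=0)
clf = GradientBoostingClassifier(n_estimators=10, random_state=0).fit(X, y)

# estimators_ is a 2-D array of fitted trees (one column per class for
# multiclass problems, a single column for binary classification).
print(type(clf.estimators_[0, 0]).__name__)  # -> DecisionTreeRegressor
```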
Abstractly, the boosting process starts with a weak base learner. In the case of decision trees, the base learner might have only a single split (also known as a decision stump). The error residuals (the difference between the predicted...