Modifying gradient boosting hyperparameters
In this section, we will focus on learning_rate, the most important gradient boosting hyperparameter, with the possible exception of n_estimators, the number of iterations or trees in the model. We will also survey some tree hyperparameters, as well as subsample, which results in stochastic gradient boosting. In addition, we will use RandomizedSearchCV and compare results with XGBoost.
learning_rate
In the last section, changing the learning_rate value of GradientBoostingRegressor from 1.0 to scikit-learn's default, which is 0.1, resulted in enormous gains.
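As a minimal sketch of that comparison, the snippet below trains two otherwise identical models that differ only in learning_rate. The California housing dataset stands in here for whatever data the last section used, and the random_state value is arbitrary:

```python
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Assumed stand-in dataset; the last section's data may differ.
X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2)

for lr in (1.0, 0.1):  # 0.1 is scikit-learn's default
    gbr = GradientBoostingRegressor(learning_rate=lr, random_state=2)
    gbr.fit(X_train, y_train)
    rmse = mean_squared_error(y_test, gbr.predict(X_test)) ** 0.5
    print(f'learning_rate={lr}: RMSE = {rmse:.3f}')
```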
learning_rate, also known as shrinkage, shrinks the contribution of individual trees so that no tree has too much influence when building the model. If an entire ensemble is built from the errors of one base learner, without careful adjustment of hyperparameters, early trees in the model can have too much influence on subsequent development. learning_rate limits the influence of individual trees.
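As a rough sketch of the mechanism (not scikit-learn's internals), each boosting stage fits a tree to the current residuals and adds only a learning_rate fraction of that tree's predictions to the running ensemble. The data and settings below are illustrative only:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy regression data; names and values here are illustrative.
rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

learning_rate, n_estimators = 0.1, 50
prediction = np.full(len(y), y.mean())  # initial constant model

for _ in range(n_estimators):
    residuals = y - prediction  # errors of the ensemble so far
    tree = DecisionTreeRegressor(max_depth=3).fit(X, residuals)
    # Shrinkage: add only a fraction of each tree's fit,
    # so no single tree dominates the model.
    prediction += learning_rate * tree.predict(X)
```

With learning_rate=1.0, each tree's full fit is added and early trees dominate; smaller values trade more trees for steadier, more reliable improvement.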