XGBoost Hyperparameters
Early Stopping
When training ensembles of decision trees with XGBoost, there are many options available for reducing overfitting and managing the bias-variance trade-off. Early stopping is one of the simplest, and it provides an automated answer to the question "How many boosting rounds are needed?". Note that early stopping requires a validation set that is separate from the training set. However, because this validation set is consulted during the training process, it does not qualify as "unseen" data held out from model training; its role is similar to that of the validation sets we used in cross-validation to select model hyperparameters in Chapter 4, The Bias-Variance Trade-Off.
When XGBoost is training successive decision trees to reduce error on the training set, it's possible that adding more and more trees to the ensemble will provide increasingly better fits to the training data while generalizing worse and worse to new data; in other words, the ensemble begins to overfit. Early stopping guards against this by evaluating the model on the validation set after each boosting round and halting training once the validation metric has failed to improve for a specified number of consecutive rounds.
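As a concrete illustration, here is a minimal sketch of early stopping using XGBoost's scikit-learn interface, with synthetic data standing in for a real dataset. It assumes a recent XGBoost version (1.6 or later), where early_stopping_rounds is passed to the estimator's constructor; in older versions it was an argument to fit() instead. The parameter values shown are illustrative, not recommendations.

```python
# A minimal sketch of early stopping with XGBoost's scikit-learn API.
# Assumes XGBoost >= 1.6; earlier versions take early_stopping_rounds
# as a fit() keyword argument rather than a constructor argument.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Synthetic data, split into a training set and a validation set
X, y = make_classification(n_samples=5000, n_features=20, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Set n_estimators generously; early stopping decides how many are used
model = XGBClassifier(
    n_estimators=1000,
    learning_rate=0.1,
    eval_metric="logloss",
    early_stopping_rounds=10,  # stop after 10 rounds with no improvement
)

# The validation set is monitored during training via eval_set
model.fit(X_train, y_train, eval_set=[(X_val, y_val)], verbose=False)

# The round with the best validation score is recorded on the model
print("Best iteration:", model.best_iteration)
```

After fitting, model.best_iteration reports the boosting round with the best validation score, which is typically far fewer than the generous n_estimators budget requested up front.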