Implementing gradient boosting
In this section, we will try to improve our random forest model using gradient boosting. One thing we will have to watch out for is overfitting, which can be more of an issue with gradient boosting decision trees than with random forests. This is because the trees for random forests do not learn from other trees, whereas with gradient boosting, each tree builds on the learning of previous trees. Our choice of hyperparameters here is key. Let’s get started:
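To make the "each tree builds on the learning of previous trees" idea concrete, here is a minimal sketch of boosting on a synthetic regression problem; each new shallow tree is fit to the residuals the current ensemble still gets wrong. (The classifier we use below works on gradients of the log-loss rather than raw residuals, but the sequential mechanism is the same. The data and settings here are illustrative, not from this chapter.)

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, size=200)

pred = np.zeros_like(y)
learning_rate = 0.1  # small steps: each tree corrects only part of the error
for _ in range(50):
    residuals = y - pred  # what the ensemble still gets wrong
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    pred += learning_rate * tree.predict(X)

print(np.mean((y - pred) ** 2))  # training error shrinks as trees are added
```

This also shows why overfitting is a bigger risk here than with a random forest: with enough trees the ensemble can chase noise in the residuals, which is why the learning rate and tree count matter so much.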
- We will start by importing the necessary libraries. We will use the same modules we used for random forests, except we will import `GradientBoostingClassifier` from `ensemble` rather than `RandomForestClassifier`:

```python
import pandas as pd
import numpy as np
from imblearn.pipeline import make_pipeline
from sklearn.model_selection import RandomizedSearchCV
from sklearn.ensemble import GradientBoostingClassifier
import sklearn.metrics as skmet
from scipy.stats import uniform
from scipy.stats import...
```
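As a preview of how these imports fit together, here is a hedged sketch of a pipeline plus randomized hyperparameter search over a `GradientBoostingClassifier`. The dataset, parameter ranges, and scoring metric below are illustrative assumptions, not this chapter's actual values, and `sklearn.pipeline.make_pipeline` stands in for imblearn's version (which additionally accepts resampling steps):

```python
import numpy as np
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import make_pipeline

# Synthetic, imbalanced stand-in for the chapter's data (assumption)
X, y = make_classification(n_samples=500, n_features=10,
                           weights=[0.8, 0.2], random_state=0)

pipe = make_pipeline(GradientBoostingClassifier(random_state=0))

# Distributions (not grids) let RandomizedSearchCV sample values;
# these ranges are illustrative guesses, not the book's choices
param_dist = {
    "gradientboostingclassifier__learning_rate": uniform(0.01, 0.3),
    "gradientboostingclassifier__n_estimators": randint(50, 300),
    "gradientboostingclassifier__max_depth": randint(2, 5),
}

search = RandomizedSearchCV(pipe, param_dist, n_iter=5, cv=3,
                            scoring="roc_auc", random_state=0)
search.fit(X, y)
print(search.best_params_)
```

Sampling `learning_rate` and the tree count jointly is one common way to manage the overfitting risk mentioned above: a lower learning rate generally needs more trees, and the search can trade the two off against each other.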