Tuning and scaling XGBClassifier
In this section, you will fine-tune and scale XGBClassifier to obtain the best possible recall_score value for the Exoplanet dataset. First, you will adjust weights using scale_pos_weight; then you will run grid searches to find the best combination of hyperparameters. You will also score models on different subsets of the data before consolidating and analyzing the results.
Adjusting weights
In Chapter 5, XGBoost Unveiled, you used the scale_pos_weight hyperparameter to counteract imbalances in the Higgs boson dataset. The scale_pos_weight hyperparameter scales the weight of the positive class. The emphasis on positive is important because XGBoost assumes that a target value of 1 is positive and a target value of 0 is negative.
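As a rough illustration of how a value for scale_pos_weight might be chosen, the sketch below computes the ratio of negative to positive cases, a commonly used starting point for imbalanced data. The label list y is hypothetical and is not taken from the Higgs boson or Exoplanet data.

```python
from collections import Counter

# Hypothetical labels (an assumption for illustration):
# 0 = negative, 1 = positive, the convention XGBoost assumes
y = [0] * 90 + [1] * 10

counts = Counter(y)

# Common heuristic: weight the positive class by the
# negative-to-positive ratio so both classes contribute comparably
weight = counts[0] / counts[1]
print(weight)
```

The resulting value could then be passed to the classifier as XGBClassifier(scale_pos_weight=weight); treat it as a starting point to tune rather than a final setting.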
In the Exoplanet dataset, we have so far kept the labels as provided by the dataset: 1 as negative and 2 as positive. We will now switch to 0 as negative and 1 as positive using the .replace() method.
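The relabeling can be sketched on a small pandas Series. The values and the column name LABEL below are assumptions standing in for the real target column, so adapt them to your DataFrame.

```python
import pandas as pd

# Hypothetical stand-in for the target column (1 = negative, 2 = positive)
y = pd.Series([1, 1, 2, 1, 2], name="LABEL")

# Order matters: convert 1 -> 0 first, so that the new 1s created
# in the second step are not overwritten
y = y.replace(1, 0)
y = y.replace(2, 1)

print(y.tolist())
```

Reversing the two calls would first turn every 2 into 1 and then turn every 1 into 0, collapsing both classes into a single label.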