Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Newsletter Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds
Arrow up icon
GO TO TOP
The Regularization Cookbook

You're reading from   The Regularization Cookbook Explore practical recipes to improve the functionality of your ML models

Arrow left icon
Product type Paperback
Published in Jul 2023
Publisher Packt
ISBN-13 9781837634088
Length 424 pages
Edition 1st Edition
Languages
Tools
Arrow right icon
Author (1):
Arrow left icon
Vincent Vandenbussche Vincent Vandenbussche
Author Profile Icon Vincent Vandenbussche
Vincent Vandenbussche
Arrow right icon
View More author details
Toc

Table of Contents (14) Chapters Close

Preface 1. Chapter 1: An Overview of Regularization 2. Chapter 2: Machine Learning Refresher FREE CHAPTER 3. Chapter 3: Regularization with Linear Models 4. Chapter 4: Regularization with Tree-Based Models 5. Chapter 5: Regularization with Data 6. Chapter 6: Deep Learning Reminders 7. Chapter 7: Deep Learning Regularization 8. Chapter 8: Regularization with Recurrent Neural Networks 9. Chapter 9: Advanced Regularization in Natural Language Processing 10. Chapter 10: Regularization in Computer Vision 11. Chapter 11: Regularization in Computer Vision – Synthetic Image Generation 12. Index 13. Other Books You May Enjoy

Performing hyperparameter optimization

In this recipe, we will explain what hyperparameter optimization is and some related concepts: the definition of a hyperparameter, cross-validation, and various hyperparameter optimization methods. We will then perform a grid search to optimize the hyperparameters of the logistic regression task on the Titanic dataset.

Getting ready

Most of the time, in ML, we do not simply train a model on the training set and evaluate it against the test set.

This is because, like most other algorithms, ML algorithms can be fine-tuned. This fine-tuning process allows us to optimize hyperparameters to achieve the best possible results. This sometimes acts as leverage so that we can regularize a model.

Note

In ML, hyperparameters can be tuned by humans, unlike parameters, which are learned through the model training process, and thus can’t be tuned.

To properly optimize hyperparameters, a third split has to be introduced: the validation set.

This means there are now three splits:

  • The training set: Where the model is trained
  • The validation set: Where the hyperparameters are optimized
  • The test set: Where the model is evaluated

You could create such a set by splitting X_train into X_train and X_valid with the train_test_split() function from scikit-learn.

But in practice, most people just use cross-validation and do not bother creating this validation set. The k-fold cross-validation method allows us to make k splits out of the training set and divide it, as presented in Figure 2.8:

Figure 2.8 – Typical split between training, validation, and test sets, without cross-validation (top) and with cross-validation (bottom)

Figure 2.8 – Typical split between training, validation, and test sets, without cross-validation (top) and with cross-validation (bottom)

In doing so, not just one model is trained, but k, for a given set of hyperparameters. The performances are averaged over those k models, based on a chosen metric (for example, accuracy, MSE, and so on).

Several sets of hyperparameters can then be tested, and the one that shows the best performance is selected. After selecting the best hyperparameter set, the model is trained one more time on the entire train set to maximize the data for training purposes.

Finally, you can implement several strategies to optimize the hyperparameters, as follows:

  • Grid search: Test all combinations of the provided values of hyperparameters
  • Random search: Randomly search combinations of hyperparameters
  • Bayesian search: Perform Bayesian optimization on the hyperparameters

How to do it…

While being rather complicated to explain conceptually, hyperparameter optimization with cross-validation is super easy to implement. In this recipe, we’ll assume that we want to optimize a logistic regression model to predict whether a passenger would have survived:

  1. First, we need to import the GridSearchCV class from sklearn.model_selection.
  2. We would like to test the following hyperparameter values for C: [0.01, 0.03, 0.1]. We must define a parameter grid with the hyperparameter as the key and the list of values to test as the value.

The C hyperparameter is the inverse of the penalization strength: the higher C is, the lower the regularization. See the next chapter for more details:

# Define the hyperparameters we want to test
param_grid = { 'C': [0.01, 0.03, 0.1] }
  1. Finally, let’s assume we want to optimize our model on accuracy, with five cross-validation folds. To do this, we will instantiate the GridSearchCV object and provide the following arguments:
    • The model to optimize, which is a LogisticRegression instance
    • The parameter grid, param_grid, which we defined previously
    • The scoring on which to optimize – that is, accuracy
    • The number of cross-validation folds, which has been set to 5 here
  2. We must also set return_train_score to True to get some useful information we can use later:
    # Instantiate the grid search object
    grid = GridSearchCV(
        LogisticRegression(),
        param_grid,
        scoring='accuracy',
        cv=5,
        return_train_score=True
    )
  3. Finally, all we have to do is train this object on the train set. This will automatically make all the computations and store the results:
    # Fit and wait
    grid.fit(X_train, y_train)
    GridSearchCV(cv=5, estimator=LogisticRegression(),
        param_grid={'C': [0.01, 0.03, 0.1]},
        return_train_score=True, scoring='accuracy')

Note

Depending on the input dataset and the number of tested hyperparameters, the fit may take some time.

Once the fit has been completed, you can get a lot of useful information, such as the following:

  • The hyperparameter set via the .best_params attribute
  • The best accuracy score via the .best_score attribute
  • The cross-validation results via the .cv_results attribute
  1. Finally, you can infer the model that was trained with optimized hyperparameters using the .predict() method:
    y_pred = grid.predict(X_test)
  2. Optionally, you can evaluate the chosen model with the accuracy score:
    print('Hyperparameter optimized accuracy:',
        accuracy_score(y_pred, y_test))

This provides the following output:

Hyperparameter optimized accuracy: 0.781229050279329

Thanks to the tools provided by scikit-learn, it is fairly easy to have a well-optimized model and evaluate it against several metrics. In the next recipe, we’ll learn how to diagnose bias and variance based on such an evaluation.

See also

The documentation for GridSearchCV can be found at https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html.

You have been reading a chapter from
The Regularization Cookbook
Published in: Jul 2023
Publisher: Packt
ISBN-13: 9781837634088
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image