Hyperparameter Tuning with Python

You're reading from Hyperparameter Tuning with Python

Product type: Book
Published in: Jul 2022
Publisher: Packt
ISBN-13: 9781803235875
Pages: 306
Edition: 1st
Author: Louis Owen

Discovering repeated k-fold cross-validation

Repeated k-fold cross-validation involves simply performing k-fold cross-validation repeatedly, N times, with a different randomization in each repetition. The final evaluation score is the average of all scores from all folds of every repetition. This strategy increases our confidence in the model's estimated performance.

So, why repeat the k-fold cross-validation? Why not just increase the value of k? It is true that increasing k reduces the bias of our model's estimated performance. However, increasing k also increases the variance of that estimate, especially when we have a small number of samples. Therefore, repeating the k-fold procedure is usually a better way to gain confidence in the estimated performance. Of course, this comes with a drawback: the increase in computation time.
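The "different randomization in each repetition" idea can be sketched by hand before reaching for the built-in helper: run a plain shuffled KFold N times with a different seed per repetition and average every fold score. The synthetic dataset and logistic-regression model below are illustrative assumptions, not part of the chapter's example:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold

# Illustrative data; the chapter's df is assumed unavailable here
X, y = make_classification(n_samples=200, random_state=0)

scores = []
for repeat in range(3):  # N = 3 repetitions
    # A different random_state per repetition gives a different shuffle
    kf = KFold(n_splits=4, shuffle=True, random_state=repeat)
    for train_idx, val_idx in kf.split(X):
        model = LogisticRegression(max_iter=1000)
        model.fit(X[train_idx], y[train_idx])
        scores.append(accuracy_score(y[val_idx], model.predict(X[val_idx])))

print(len(scores))                # 3 repetitions x 4 folds = 12 scores
print(round(np.mean(scores), 3))  # the final evaluation score
```

The averaging over 12 scores, rather than 4, is what smooths out the luck of any single shuffle.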

To implement this strategy, we could simply write a manual for-loop, applying the k-fold cross-validation strategy in each iteration. Fortunately, the Scikit-Learn package provides us with a dedicated class that implements this strategy:

from sklearn.model_selection import train_test_split, RepeatedKFold

df_cv, df_test = train_test_split(df, test_size=0.2, random_state=0)
rkf = RepeatedKFold(n_splits=4, n_repeats=3, random_state=0)
for train_index, val_index in rkf.split(df_cv):
    df_train, df_val = df_cv.iloc[train_index], df_cv.iloc[val_index]
    # perform training or hyperparameter tuning here
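Instead of consuming the splits in a loop, the same splitter can also be passed straight to cross_val_score, which fits the model once per split and returns one score per split. The synthetic data and logistic-regression model below are illustrative assumptions standing in for the chapter's df:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedKFold, cross_val_score

# Illustrative data standing in for the chapter's DataFrame
X, y = make_classification(n_samples=200, random_state=0)

rkf = RepeatedKFold(n_splits=4, n_repeats=3, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=rkf)

print(len(scores))              # 4 splits x 3 repeats = 12 scores
print(round(scores.mean(), 3))  # the final evaluation score
```

This is handy when no per-fold custom logic is needed; for hyperparameter tuning inside each fold, the explicit loop above remains the more flexible option.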

Choosing n_splits=4 and n_repeats=3 means that we will have 12 different train and validation sets. The final evaluation score is then just the average of all 12 scores. As you might expect, there is also a dedicated class that implements repeated k-fold in a stratified fashion:

from sklearn.model_selection import train_test_split, RepeatedStratifiedKFold

df_cv, df_test = train_test_split(df, test_size=0.2, random_state=0, stratify=df['class'])
rskf = RepeatedStratifiedKFold(n_splits=4, n_repeats=3, random_state=0)
for train_index, val_index in rskf.split(df_cv, df_cv['class']):
    df_train, df_val = df_cv.iloc[train_index], df_cv.iloc[val_index]
    # perform training or hyperparameter tuning here

The RepeatedStratifiedKFold class performs stratified k-fold cross-validation repeatedly, n_repeats times.
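One way to see what the stratified variant buys you is to inspect the class ratio inside each validation fold: it stays equal to the overall ratio in every split. The imbalanced toy labels below are an illustrative assumption:

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold

# Imbalanced toy labels: 80% class 0, 20% class 1 (illustrative)
y = np.array([0] * 80 + [1] * 20)
X = np.arange(100).reshape(-1, 1)

rskf = RepeatedStratifiedKFold(n_splits=4, n_repeats=3, random_state=0)
ratios = [y[val_idx].mean() for _, val_idx in rskf.split(X, y)]

print(len(ratios))  # 4 splits x 3 repeats = 12 validation folds
print(set(ratios))  # every fold preserves the 20% positive-class ratio
```

With a plain RepeatedKFold on the same data, these per-fold ratios would fluctuate around 0.2, which matters for small or heavily imbalanced datasets.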

Now that you have learned another variation of the cross-validation strategy, called repeated k-fold cross-validation, let's learn about the other variations next.
