Debugging Machine Learning Models with Python

Develop high-performance, low-bias, and explainable machine learning and deep learning models

Product type Paperback
Published in Sep 2023
Publisher Packt
ISBN-13 9781800208582
Length 344 pages
Edition 1st Edition
Author Ali Madani

Model and prediction-centric debugging

The predictions a model makes during training, testing, and production can help us detect issues with the model and find opportunities to improve it. Here, we briefly review some aspects of model- and prediction-centric debugging. Later chapters of this book cover these problems in more detail, along with other considerations for achieving a reliable model, how to identify the source of an issue, and how to resolve it.

Underfitting and overfitting

When we train a model, such as a supervised learning model, the goal is to achieve high performance not just on the training set but also on the test set. When a model performs poorly even on the training set, we are dealing with underfitting. To address it, we can develop more complex models, such as a random forest or a deep learning model, instead of linear or logistic regression models. More complex models might reduce underfitting, but they can also cause overfitting, which lowers the generalizability of the predictions to test or production data (Figure 1.5):

Figure 1.5 – Schematic illustration of underfitting and overfitting


Algorithm and hyperparameter selection determine the level of model complexity and, with it, the chance of underfitting or overfitting when training and testing a machine learning model. For example, if you choose a model that can learn nonlinear patterns instead of a linear model, your model is less likely to underfit because it can identify more complex patterns in the training data. At the same time, you increase the chance of overfitting because some of the complex patterns in the training data might not generalize to the test data (Figure 1.5). There are approaches to assessing underfitting and overfitting that will help you develop a high-performance and generalizable model. We will discuss these in future chapters.
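As a rough illustration of this trade-off, the following sketch fits polynomials of increasing degree to a small synthetic dataset and compares training and test errors. The data (a noisy sine curve), the chosen degrees, and the helper names `make_data` and `mse` are assumptions made for this example only, not something taken from the book.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def make_data(n):
    """Draw n noisy samples of a sine curve (synthetic, for illustration only)."""
    x = rng.uniform(-3.0, 3.0, size=n)
    y = np.sin(x) + rng.normal(scale=0.3, size=n)
    return x, y

def mse(model, x, y):
    """Mean squared error of a fitted polynomial on the given data."""
    return float(np.mean((model(x) - y) ** 2))

x_train, y_train = make_data(20)    # small training set
x_test, y_test = make_data(200)     # held-out data from the same process

results = {}
for degree in (1, 3, 15):           # linear, moderate, and very flexible models
    model = Polynomial.fit(x_train, y_train, deg=degree)
    results[degree] = (mse(model, x_train, y_train), mse(model, x_test, y_test))
    print(f"degree={degree:2d}  train MSE={results[degree][0]:.4f}  "
          f"test MSE={results[degree][1]:.4f}")
```

A high training error for the straight line points toward underfitting, while a large gap between training and test error for the degree-15 polynomial points toward overfitting.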

Model hyperparameters

Some parameters affect the performance of a machine learning model but are usually not optimized automatically during training. These are called hyperparameters. We will go through examples of such hyperparameters, such as the number of trees in a random forest model or the size of the hidden layers in a neural network model, in future chapters.
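To make the distinction concrete, here is a minimal sketch using scikit-learn, assuming it is installed. The number of trees and the hidden layer sizes are set by us before training, while the trees and the weight matrices are learned during `fit`. The synthetic dataset and the specific values (50 trees, hidden layers of 16 and 8 units) are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

# Synthetic binary classification data (illustrative only)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hyperparameters: chosen before training, not learned from the data
forest = RandomForestClassifier(n_estimators=50, random_state=0)
mlp = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=500, random_state=0)

forest.fit(X, y)
mlp.fit(X, y)

# Learned parameters: produced by the training process
print("trees built:", len(forest.estimators_))
print("weight matrix shapes:", [w.shape for w in mlp.coefs_])
```

Changing `n_estimators` or `hidden_layer_sizes` changes the structure that training has to work with, which is why these values have to be selected outside the training loop.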

Inference in model testing and production

The eventual goal of machine learning modeling is to have a highly effective model in production. When we test a model, we assess its generalizability, but we still cannot be sure how it will perform on data it has not seen. In addition, the data used to train a machine learning model can become out of date. For example, changes in clothing market trends could make the predictions of a clothing recommendation model unreliable.

There are different concepts in this topic, such as data variance, data drift, and model drift, all of which we will cover in the next few chapters.
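One simple way to check whether a feature's distribution has moved since training is a two-sample Kolmogorov-Smirnov test. The sketch below assumes SciPy is available and uses synthetic data; the helper `drift_report`, the threshold `alpha`, and the simulated shift are hypothetical choices for this example and are not necessarily the approach the book takes.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)

# One numeric feature as seen at training time and at two later points
train_feature = rng.normal(loc=0.0, scale=1.0, size=1000)
recent_same = rng.normal(loc=0.0, scale=1.0, size=1000)      # same process
recent_shifted = rng.normal(loc=1.0, scale=1.0, size=1000)   # simulated shift

def drift_report(reference, current, alpha=0.01):
    """Compare two samples of one feature with a two-sample KS test."""
    result = ks_2samp(reference, current)
    return {
        "statistic": float(result.statistic),
        "p_value": float(result.pvalue),
        "drift_suspected": bool(result.pvalue < alpha),
    }

report_same = drift_report(train_feature, recent_same)
report_shifted = drift_report(train_feature, recent_shifted)
print("same process:", report_same)
print("shifted:     ", report_shifted)
```

A check like this only flags a change in the input distribution. Whether the model's predictions have actually degraded still has to be confirmed against labeled data.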

Data or hyperparameters for changing landscapes

When we train a machine learning model with specific training data and a set of hyperparameters, the values of the model parameters are adjusted so that they are as close as possible to an optimum point for a defined objective or loss function. Beyond this optimization, the two main tools for achieving a better model are providing better data for training and selecting better hyperparameters. Each algorithm has its own capacity for performance improvement. By tuning hyperparameters alone, you will not necessarily arrive at the best possible model. In the same way, by increasing the quality and quantity of your data while keeping the hyperparameters fixed, you might not achieve the best possible performance either. So, data and hyperparameters go hand in hand. Before you read the next chapters, remember that spending more time and money on hyperparameter optimization alone does not necessarily give you a better model. We will look at this in more detail later in this book.
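The interplay between data and hyperparameters can be explored with a small experiment that varies both at once. The following sketch, which assumes scikit-learn is installed, trains a random forest with two training set sizes and two values of `max_depth` and records the test accuracy for each combination. The synthetic dataset and all specific values are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data: 2,000 rows available for training, 1,000 held out for testing
X, y = make_classification(
    n_samples=3000, n_features=20, n_informative=5, random_state=0
)
X_pool, X_test, y_pool, y_test = train_test_split(
    X, y, test_size=1000, random_state=0
)

scores = {}
for n_train in (100, 2000):          # amount of training data
    for max_depth in (2, None):      # a hyperparameter controlling complexity
        model = RandomForestClassifier(
            n_estimators=50, max_depth=max_depth, random_state=0
        )
        model.fit(X_pool[:n_train], y_pool[:n_train])
        scores[(n_train, max_depth)] = model.score(X_test, y_test)

for (n_train, max_depth), accuracy in scores.items():
    print(f"n_train={n_train:4d}  max_depth={max_depth}  test accuracy={accuracy:.3f}")
```

Reading the scores as a grid, rather than along a single axis, shows whether more data, a different hyperparameter value, or both are needed to improve the model.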
