Practical Deep Learning at Scale with MLflow

Bridge the gap between offline experimentation and online production

Author: Yong Liu
Product type: Paperback
Published: Jul 2022
Publisher: Packt
ISBN-13: 9781803241333
Length: 288 pages
Edition: 1st Edition
Table of Contents

Preface
Section 1: Deep Learning Challenges and MLflow Prime
  • Chapter 1: Deep Learning Life Cycle and MLOps Challenges
  • Chapter 2: Getting Started with MLflow for Deep Learning
Section 2: Tracking a Deep Learning Pipeline at Scale
  • Chapter 3: Tracking Models, Parameters, and Metrics
  • Chapter 4: Tracking Code and Data Versioning
Section 3: Running Deep Learning Pipelines at Scale
  • Chapter 5: Running DL Pipelines in Different Environments
  • Chapter 6: Running Hyperparameter Tuning at Scale
Section 4: Deploying a Deep Learning Pipeline at Scale
  • Chapter 7: Multi-Step Deep Learning Inference Pipeline
  • Chapter 8: Deploying a DL Inference Pipeline at Scale
Section 5: Deep Learning Model Explainability at Scale
  • Chapter 9: Fundamentals of Deep Learning Explainability
  • Chapter 10: Implementing DL Explainability with MLflow
Other Books You May Enjoy

Understanding DL code challenges

In this section, we will discuss DL code challenges and look at how they manifest in each of the stages described in Figure 1.3. Within the context of DL development, code refers to the source code written in a programming language such as Python for data processing and model implementation, while a model refers to the model logic and architecture in its serialized format (for example, pickle, TorchScript, or ONNX):

  • Data collection/cleaning/annotation: While data is the central piece in this stage, the code that performs the querying, extraction/transformation/loading (ETL), and data cleaning and augmentation is of critical importance. We cannot decouple the development of the model from the data pipelines that feed data to the model. Therefore, data pipelines that implement ETL need to be treated as an integrated step in both offline experimentation and online production. A common mistake is to use one set of ETL and cleaning pipelines in offline experimentation and then implement a different set in online production, which can cause the model to behave differently in the two settings. We need to version and serialize the data pipeline as part of the entire model pipeline. MLflow provides several ways to implement such multistep pipelines.
  • Model development: During offline experiments, in addition to different versions of data pipeline code, we might also have different versions of notebooks or use different versions of DL library code. Heavy use of notebooks is particularly characteristic of ML/DL life cycles. For each run, we need to track which notebook, model pipeline, and data pipeline produced which model results. MLflow does this by automatically tracking code versions and dependencies. In addition, reproducing code across different running environments is a particular concern for DL models, as they usually require hardware accelerators such as GPUs or TPUs. The flexibility to run either locally or remotely, in a CPU or GPU environment, is of great importance. MLflow provides a lightweight way to organize ML projects so that code can be written once and run everywhere.
  • Model deployment and serving in production: While the model is serving production traffic, any bugs will need to be traced back to both the model and code. Thus, tracking code provenance is critical. It is also critical to track all the dependency code library versions for a particular version of the model.
  • Model validation and A/B testing: Online experiments could use multiple versions of models using different data feeds. Debugging any experimentation will require not only knowing which model is used but also which code is used to produce that model.
  • Monitoring and feedback loops: The code challenges in this stage are similar to those in the previous stage: we need to know whether model degradation is due to code bugs or to model and data drift. The monitoring pipeline needs to collect all the metrics for both data and model performance.
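
The provenance requirements in the list above (versioning the data pipeline together with the model, and recording the dependency versions behind a given model version) can be sketched in plain Python. This is a minimal illustration, not MLflow's own mechanism: the names `PIPELINE_SPEC`, `fingerprint`, `installed_version`, and `build_run_record`, as well as the step names and version strings, are hypothetical and chosen only for this example.

```python
import hashlib
import json
import platform
from importlib import metadata

# Hypothetical description of a data pipeline: step names, code versions,
# and parameters. In a real project these would come from source control.
PIPELINE_SPEC = {
    "steps": [
        {"name": "clean_text", "code_version": "1.0.0", "params": {"lowercase": True}},
        {"name": "tokenize", "code_version": "0.3.1", "params": {"max_length": 128}},
    ]
}


def fingerprint(spec: dict) -> str:
    """Return a stable SHA-256 digest of a JSON-serializable pipeline spec."""
    # Sorting keys makes the digest independent of dictionary ordering.
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def installed_version(package: str) -> str:
    """Look up an installed package version, tolerating missing packages."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"


def build_run_record(spec: dict, packages: list) -> dict:
    """Collect the provenance details we would want stored with each run."""
    return {
        "pipeline_fingerprint": fingerprint(spec),
        "python_version": platform.python_version(),
        "dependencies": {name: installed_version(name) for name in packages},
    }


# The package names here are examples; list whatever the model depends on.
record = build_run_record(PIPELINE_SPEC, ["torch", "mlflow"])
print(json.dumps(record, indent=2))
```

If the offline experiment and the online service each compute this fingerprint from the pipeline they actually run, a mismatch between the two values signals that the pipelines have diverged. A record like this could be attached to a tracked run, for example as tags or parameters, so that a production bug can be traced back to the exact pipeline and library versions involved.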

In summary, DL code challenges are particularly acute because DL frameworks (for example, TensorFlow, PyTorch, Keras, Hugging Face, and SparkNLP) are still evolving. MLflow provides a lightweight framework that addresses many of these common challenges and can interface with many DL frameworks seamlessly.
