Developing deep learning models
Let’s start with a short recap of what deep learning is. The core building block of deep learning is the neural network, an algorithm loosely inspired by the human brain. Its building blocks are called neurons, which mimic the billions of neurons the human brain contains. In the context of neural networks, neurons are objects that store simple pieces of information called weights and biases. Think of these as the memory of the algorithm.
Deep learning architectures are essentially neural network architectures with three or more layers. Neural network layers can be categorized into three high-level groups – the input layer, the hidden layer, and the output layer. The input layer is the simplest group; its only job is to pass the input data to subsequent layers. Its neurons contain no biases and can be considered passive, although the connections from this layer to the next still carry weights. The hidden layer comprises neurons that contain biases, with weights on their connections to neurons in subsequent layers. Finally, the output layer comprises neurons that contain biases, and its size relates to the number of classes and the problem type. A best practice when counting neural network layers is to exclude the input layer. So, a neural network with one input layer, one hidden layer, and one output layer is considered a two-layer neural network. The following figure shows a basic neural network, called a multilayer perceptron (MLP), with a single input layer, a single hidden layer, and a single output layer:
Figure 1.12 – A simple deep learning architecture, also called an MLP
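The following is a minimal sketch of the MLP from the preceding figure written in PyTorch (one of the frameworks discussed later in this chapter); the layer sizes of 4 input features, 8 hidden neurons, and 3 output classes are purely illustrative assumptions:

```python
import torch
from torch import nn

class SimpleMLP(nn.Module):
    def __init__(self, num_features=4, hidden_size=8, num_classes=3):
        super().__init__()
        # Hidden layer: weights on the incoming connections plus a bias per neuron
        self.hidden = nn.Linear(num_features, hidden_size)
        # Output layer: one neuron (with a bias) per class
        self.output = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # The input layer is simply the incoming tensor x being passed along;
        # the non-linear activation lets the model capture non-linear relationships
        x = torch.relu(self.hidden(x))
        return self.output(x)

model = SimpleMLP()
print(model)  # one hidden layer + one output layer = a two-layer network
```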
Being a subset of the wider machine learning field, deep learning models are capable of learning patterns from data through a loss function and an optimizer algorithm that minimizes that loss. The loss function quantifies the error made by the model so that its memory (weights and biases) can be updated to perform better in the next iteration. The optimizer algorithm decides the strategy for updating the weights given the loss value.
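To make this concrete, here is a hedged sketch of a single training iteration in PyTorch, using a randomly generated batch and arbitrary hyperparameters purely for illustration:

```python
import torch
from torch import nn

# A tiny two-layer network (equivalent to the MLP sketched earlier)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))
criterion = nn.CrossEntropyLoss()                         # loss function: quantifies the model's error
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # optimizer: the weight-update strategy

inputs = torch.randn(16, 4)               # a batch of 16 samples with 4 features (random stand-in data)
targets = torch.randint(0, 3, (16,))      # their class labels

optimizer.zero_grad()                     # clear gradients from the previous iteration
loss = criterion(model(inputs), targets)  # measure the error on this batch
loss.backward()                           # compute gradients of the loss w.r.t. weights and biases
optimizer.step()                          # update the model's memory (weights and biases)
```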
With this short recap, let’s dive into a summary of the common deep learning model families.
Deep learning model families
Neural network layers come in many forms, as researchers continually invent new layer definitions to tackle new problem types, and they are almost always paired with a non-linear activation function that allows the model to capture non-linear relationships in the data. Along with this variety of layers come many different deep learning architecture families aimed at different problem types. A few of the most common and widely used deep learning model families are as follows (a short code sketch after this list illustrates the corresponding layer types):
- MLP for tabular data types. This will be explored in Chapter 2, Designing Deep Learning Architectures.
- Convolutional neural network for image data types. This will be explored in Chapter 3, Understanding Convolutional Neural Networks.
- Autoencoders for anomaly detection, data compression, data denoising, and feature representation learning. This will be explored in Chapter 5, Understanding Autoencoders.
- Gated recurrent unit (GRU), Long Short-Term Memory (LSTM), and Transformers for sequence data types. These will be explored in Chapter 4, Understanding Recurrent Neural Networks, and Chapter 6, Understanding Neural Network Transformers, respectively.
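As referenced above, the following hedged sketch shows how PyTorch exposes the core layer types behind these families; the layer sizes are arbitrary and purely illustrative:

```python
from torch import nn

tabular_layer   = nn.Linear(in_features=10, out_features=32)                # MLP building block
image_layer     = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)  # convolutional building block
recurrent_layer = nn.LSTM(input_size=10, hidden_size=32, batch_first=True)  # LSTM building block (nn.GRU is analogous)
attention_layer = nn.MultiheadAttention(embed_dim=32, num_heads=4)          # Transformer building block
```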
These architectures will be the focus of Chapters 2 to 6, where we will discuss their methodology and go through some practical evaluation. Next, let’s look at the strategy for developing deep learning models.
The model development strategy
Today, deep learning models are easy to build thanks to the advent of deep learning frameworks such as PyTorch and TensorFlow, along with their high-level library wrappers. Which framework you choose is largely a matter of preference regarding their interfaces, as both have matured through years of improvement. Only when there is a pressing need for a very custom function to tackle a unique problem type will you need to choose the framework that can execute what you need. Once you’ve chosen your deep learning framework, model creation, training, and evaluation are largely covered out of the box.
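As an illustration of the evaluation side of that workflow, here is a hedged PyTorch sketch using random stand-in data; in practice, you would iterate over a held-out validation dataset:

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))

val_inputs = torch.randn(100, 4)            # stand-in validation features
val_targets = torch.randint(0, 3, (100,))   # stand-in validation labels

model.eval()                                # switch layers such as dropout to inference mode
with torch.no_grad():                       # gradients are not needed for evaluation
    predictions = model(val_inputs).argmax(dim=1)
    accuracy = (predictions == val_targets).float().mean().item()
print(f"Validation accuracy: {accuracy:.2%}")
```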
However, model management functions do not come out of the box with these frameworks. Model management is an area of technology that allows teams, businesses, and deep learning practitioners to reliably, quickly, and effectively build models, evaluate models, deliver model insights, deploy models to production, and govern models. Model management is sometimes referred to as machine learning operations (MLOps). You might still be wondering why you’d need such functionality, especially if you’ve only been building deep learning models on Kaggle, a platform that hosts data and machine learning problems as competitions. So, here are some factors that drive the need for these capabilities:
- It is cumbersome to compare models manually:
  - Manually typing performance numbers into an Excel sheet to keep track of model performance is slow and unreliable
- Model artifacts are hard to keep track of:
  - A model has many artifacts, such as its trained weights, performance graphs, feature importance, and prediction explanations
  - Comparing model artifacts manually is also cumbersome
- Model versioning is needed to ensure model-building experiments are not repeated:
  - Accidentally overwriting the top-performing model and its most reliable insights is the last thing you want to experience
  - Versioning should depend on the data partitioning method, model settings, and software library versions
- It is not straightforward to deploy and govern models
Depending on the size of the team involved in the project and how often components need to be reused, different software and libraries will fit the bill. These tools fall into paid and free (usually open source) categories. Metaflow, an open source tool, suits larger data science teams where components are likely to be reused across projects, whereas MLflow (also open source) is more suitable for small or single-person teams. Other notable model management tools are Comet (paid), Weights & Biases (paid), Neptune (paid), and Algorithmia (paid).
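While each tool has its own interface, the core experiment-tracking workflow they provide looks roughly like the following MLflow sketch; the run name, parameter values, and metric value here are purely illustrative:

```python
import mlflow

with mlflow.start_run(run_name="mlp-baseline"):
    # Record the settings that define this model version
    mlflow.log_param("hidden_size", 8)
    mlflow.log_param("learning_rate", 0.01)
    # Record performance so runs can be compared without manual spreadsheets
    mlflow.log_metric("val_accuracy", 0.87)
    # Store artifacts (e.g., trained weights or plots) alongside the run
    with open("run_notes.txt", "w") as f:
        f.write("Baseline MLP run")
    mlflow.log_artifact("run_notes.txt")
```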
With that, we have provided a brief overview of deep learning model development methodology and strategy; we will dive deeper into model development topics in the next few chapters. But before that, let’s continue with an overview of the topic of delivering model insights.