Uncovering transformer improvements using only the decoder
Recall that the decoder block of the transformer is autoregressive: it predicts each token conditioned only on the tokens that precede it. For the decoder-only line of models, this autoregressive token-prediction task remains the same. With the encoder removed, the architecture adapts its input format so that more than one sentence can be packed into a single sequence, similar to what BERT does: start, end, and separator tokens mark the boundaries of the concatenated segments. Causal masking is still applied so that, when predicting a token, the model cannot attend to tokens that come later in the sequence, and positional embeddings are retained, both as in the original transformer. A minimal sketch of this causal masking follows below.
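To make the causal mask concrete, here is a minimal sketch in PyTorch. The function name, the toy shapes, and the special-token names in the comment are illustrative assumptions, not taken from any particular GPT codebase; the point is only to show how future positions are hidden before the softmax.

```python
import torch

# Illustrative input packing for a two-part input (token names are assumptions):
# tokens = ["<start>"] + segment_a + ["<sep>"] + segment_b + ["<end>"]

def causal_attention_weights(q, k):
    """Scaled dot-product attention with a causal (look-ahead) mask.

    q, k: tensors of shape (seq_len, d_k). Scores for future positions are
    set to -inf so that, after softmax, each position attends only to itself
    and to earlier positions.
    """
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5            # (seq_len, seq_len)
    seq_len = scores.size(-1)
    future = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(future, float("-inf"))       # hide future tokens
    return torch.softmax(scores, dim=-1)                     # attention weights

# Example: with 4 tokens, row i has non-zero weights only for positions <= i.
q = k = torch.randn(4, 8)
print(causal_attention_weights(q, k))
```

In the printed matrix, the strictly upper-triangular entries are zero, which is exactly the property that lets the model be trained to predict every next token in parallel without leaking information from the future.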
Diving into the GPT model family
All these architectural concepts were introduced by the GPT model in 2018, whose name is short for generative pre-training. As the name suggests, GPT also adopts unsupervised pre-training as the initial stage and subsequently moves into supervised fine-tuning...