Case study – optimizing the ExpressText LLM for mobile deployment
In this section, we walk through a hypothetical case study that illustrates how an LLM can be optimized for mobile deployment.
Background
ExpressText is a state-of-the-art LLM designed for NLP tasks, including translation and summarization. Despite its effectiveness, the model’s size and computational demands limit its deployment on mobile devices.
Objective
The objective was to optimize ExpressText for mobile deployment, ensuring that it retains high accuracy while achieving a smaller size and faster inference on mobile hardware.
Methodology
Three main optimization techniques were applied:
- Quantization: The model’s 32-bit floating-point weights were converted to 8-bit integers, cutting weight storage by roughly 4x. Quantization-aware training was employed to minimize the resulting accuracy loss.
- Pruning: Using iterative magnitude-based pruning, weights with the smallest absolute values were removed in successive rounds, shrinking the network while preserving accuracy.