Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Building LLM Powered  Applications

You're reading from   Building LLM Powered Applications Create intelligent apps and agents with large language models

Arrow left icon
Product type Paperback
Published in May 2024
Publisher Packt
ISBN-13 9781835462317
Length 342 pages
Edition 1st Edition
Languages
Tools
Arrow right icon
Author (1):
Arrow left icon
Valentina Alto Valentina Alto
Author Profile Icon Valentina Alto
Valentina Alto
Arrow right icon
View More author details
Toc

Table of Contents (16) Chapters Close

Preface 1. Introduction to Large Language Models 2. LLMs for AI-Powered Applications FREE CHAPTER 3. Choosing an LLM for Your Application 4. Prompt Engineering 5. Embedding LLMs within Your Applications 6. Building Conversational Applications 7. Search and Recommendation Engines with LLMs 8. Using LLMs with Structured Data 9. Working with Code 10. Building Multimodal Applications with LLMs 11. Fine-Tuning Large Language Models 12. Responsible AI 13. Emerging Trends and Innovations 14. Other Books You May Enjoy
15. Index

Base models versus customized models

The nice thing about LLMs is that they have been trained and ready to use. As we saw in the previous section, training an LLM requires great investment in hardware (GPUs or TPUs) and it might last for months, and these two factors might mean it is not feasible for individuals and small businesses.

Luckily, pre-trained LLMs are generalized enough to be applicable to various tasks, so they can be consumed without further tuning directly via their REST API (we will dive deeper into model consumption in the next chapters).

Nevertheless, there might be scenarios where a general-purpose LLM is not enough, since it lacks domain-specific knowledge or doesn’t conform to a particular style and taxonomy of communication. If this is the case, you might want to customize your model.

How to customize your model

There are three main ways to customize your model:

  • Extending non-parametric knowledge: This allows the model to access external sources of information to integrate its parametric knowledge while responding to the user’s query.

Definition

LLMs exhibit two types of knowledge: parametric and non-parametric. The parametric knowledge is the one embedded in the LLM’s parameters, deriving from the unlabeled text corpora during the training phase. On the other hand, non-parametric knowledge is the one we can “attach” to the model via embedded documentation. Non-parametric knowledge doesn’t change the structure of the model, but rather, allows it to navigate through external documentation to be used as relevant context to answer the user’s query.

This might involve connecting the model to web sources (like Wikipedia) or internal documentation with domain-specific knowledge. The connection of the LLM to external sources is called a plug-in, and we will be discussing it more deeply in the hands-on section of this book.

  • Few-shot learning: In this type of model customization, the LLM is given a metaprompt with a small number of examples (typically between 3 and 5) of each new task it is asked to perform. The model must use its prior knowledge to generalize from these examples to perform the task.

Definition

A metaprompt is a message or instruction that can be used to improve the performance of LLMs on new tasks with a few examples.

  • Fine tuning: The fine-tuning process involves using smaller, task-specific datasets to customize the foundation models for particular applications.

This approach differs from the first ones because, with fine-tuning, the parameters of the pre-trained model are altered and optimized toward the specific task. This is done by training the model on a smaller labeled dataset that is specific to the new task. The key idea behind fine-tuning is to leverage the knowledge learned from the pre-trained model and fine-tune it to the new task, rather than training a model from scratch.

A diagram of a data processing process

Description automatically generated

Figure 1.12: Illustration of the process of fine-tuning

In the preceding figure, you can see a schema on how fine-tuning works on OpenAI pre-built models. The idea is that you have available a pre-trained model with general-purpose weights or parameters. Then, you feed your model with custom data, typically in the form of “key-value” prompts and completions:

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
...

Once the training is done, you will have a customized model that is particularly performant for a given task, for example, the classification of your company’s documentation.

The nice thing about fine-tuning is that you can make pre-built models tailored to your use cases, without the need to retrain them from scratch, yet leveraging smaller training datasets and hence less training time and compute. At the same time, the model keeps its generative power and accuracy learned via the original training, the one that occurred to the massive dataset.

In Chapter 11, Fine-Tuning Large Language Models, we will focus on fine-tuning your model in Python so that you can test it for your own task.

On top of the above techniques (which you can also combine among each other), there is a fourth one, which is the most “drastic.” It consists of training an LLM from scratch, which you might want to either build on your own or initialize from a pre-built architecture. We will see how to approach this technique in the final chapters.

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime