Introduction to Large Language Models
Dear reader, welcome to Building Large Language Model Applications! In this book, we will explore the fascinating world of a new era of application development, where large language models (LLMs) are the main protagonists.
During the last year, we all learned the power of generative artificial intelligence (AI) tools such as ChatGPT, Bing Chat, Bard, and DALL-E. What impressed us the most was their stunning capability to generate human-like content from user requests made in natural language. It is, in fact, their conversational capabilities that made them so easy to consume and, therefore, so popular as soon as they entered the market. Through this phase, we learned to acknowledge the power of generative AI and its core models: LLMs. However, LLMs are more than language generators. They can also be seen as reasoning engines that can become the brains of our intelligent applications.
In this book, we will cover the theory and practice of building LLM-powered applications, addressing a variety of scenarios and introducing the new components and frameworks that are entering the domain of software development in this new era of AI. The book will start with Part 1, where we will introduce the theory behind LLMs, the most promising LLMs on the market right now, and the emerging frameworks for LLM-powered applications. Afterward, we will move to a hands-on part where we will implement many applications using various LLMs, addressing different scenarios and real-world problems. Finally, we will conclude the book with a third part covering the emerging trends in the field of LLMs, alongside the risks of AI tools and how to mitigate them with responsible AI practices.
So, let’s dive in and start with some definitions to frame the context we are moving in. This chapter provides an introduction to and deep dive into LLMs, a powerful class of deep learning neural networks that underpin the domain of generative AI.
In this chapter, we will cover the following topics:
- Understanding LLMs, their differentiators from classical machine learning models, and their relevant jargon
- Overview of the most popular LLM architectures
- How LLMs are trained and consumed
- Base LLMs versus fine-tuned LLMs
By the end of this chapter, you will have a fundamental understanding of what LLMs are, how they work, and how you can tailor them to your applications. This will also pave the way for the concrete usage of LLMs in the hands-on part of this book, where we will see in practice how to embed LLMs within your applications.