LLM Engineer's Handbook: Master the art of engineering large language models from concept to production

Product type Paperback

Published in Oct 2024

Publisher Packt

ISBN-13 9781836200079

Length 522 pages

Edition 1st Edition

Languages

Python

Tools

AWS

Concepts

Artificial Intelligence

Authors (3):

Maxime Labonne

Paul Iusztin

Alex Vesa

Table of Contents (15 chapters)

Preface

1. Understanding the LLM Twin Concept and Architecture

2. Tooling and Installation

3. Data Engineering

4. RAG Feature Pipeline

5. Supervised Fine-Tuning

6. Fine-Tuning with Preference Alignment

7. Evaluating LLMs

8. Inference Optimization

9. RAG Inference Pipeline

10. Inference Pipeline Deployment

11. MLOps and LLMOps

12. Other Books You May Enjoy

13. Index

Appendix: MLOps Principles

Databases for storing unstructured and vector data

We also want to introduce the NoSQL and vector databases used in our examples. For local work, both are already integrated through Docker: when you run poetry poe local-infrastructure-up, as instructed a few sections above, the Docker images for both databases are pulled and run on your machine. Later, when deploying the project, we will show you how to use their serverless options and integrate them with the rest of the LLM Twin project.
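
Once the containers are up, a quick way to confirm that both databases are reachable is to probe their ports. The following is a minimal sketch and not part of the project's code: the helpers is_port_open and check_local_services are hypothetical, and the ports are the MongoDB and Qdrant defaults (27017 and 6333), which we assume the local Docker setup exposes on localhost.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Assumed default ports; adjust them if your Docker setup maps different ones.
LOCAL_SERVICES = {"MongoDB": 27017, "Qdrant": 6333}


def check_local_services(host: str = "127.0.0.1") -> dict[str, bool]:
    """Map each service name to whether its port accepts TCP connections."""
    return {name: is_port_open(host, port) for name, port in LOCAL_SERVICES.items()}


if __name__ == "__main__":
    for name, reachable in check_local_services().items():
        print(f"{name}: {'reachable' if reachable else 'not reachable'}")
```

An open port only shows that something is listening; it does not prove that the database behind it is healthy, so treat this as a first sanity check.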

MongoDB: NoSQL database

MongoDB is one of today’s most popular, robust, fast, and feature-rich NoSQL databases. It integrates well with most cloud ecosystems, such as AWS, Google Cloud, Azure, and Databricks. Thus, using MongoDB as our NoSQL database was a no-brainer.

When we wrote this book, MongoDB was used by big players such as Novo Nordisk, Delivery Hero, Okta, and Volvo. This widespread adoption suggests that MongoDB will remain a leading NoSQL database for a long time.

We use MongoDB to store the raw data we collect from the internet before processing it and pushing it into the vector database. As we work with unstructured text data, the schema flexibility of a NoSQL database is a natural fit.
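
To illustrate why a schema-less store suits raw crawled text, here is a minimal sketch. The document fields, the build_raw_document and save_raw_document helpers, the connection URI, and the database and collection names are all illustrative assumptions, not the project's actual schema. Only MongoClient, insert_one, and close are real pymongo calls.

```python
from datetime import datetime, timezone
from typing import Any


def build_raw_document(platform: str, link: str, content: dict[str, Any]) -> dict[str, Any]:
    """Wrap crawled content in a common envelope; `content` may differ per source."""
    return {
        "platform": platform,
        "link": link,
        "content": content,
        "crawled_at": datetime.now(timezone.utc),
    }


def save_raw_document(
    doc: dict[str, Any],
    uri: str = "mongodb://localhost:27017",  # assumed local default
    db_name: str = "raw_data_demo",          # hypothetical database name
    collection: str = "documents",           # hypothetical collection name
) -> Any:
    """Insert one document into MongoDB and return its generated _id."""
    from pymongo import MongoClient  # imported lazily; requires `pip install pymongo`

    client = MongoClient(uri)
    try:
        result = client[db_name][collection].insert_one(doc)
        return result.inserted_id
    finally:
        client.close()


# Two sources with differently shaped content can share one collection.
article = build_raw_document(
    "blog", "https://example.com/post", {"title": "Sample post", "body": "Sample text"}
)
repository = build_raw_document(
    "github", "https://example.com/repo", {"name": "sample-repo", "files": {"README.md": "Sample"}}
)
```

Calling save_raw_document(article) and save_raw_document(repository) against a running MongoDB instance would store both documents without any schema migration, even though their content fields have different shapes.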

Qdrant: vector database

Qdrant (https://qdrant.tech/) is one of the most popular, robust, and feature-rich vector databases. We could have used almost any vector database for our small MVP, but we wanted to pick something light and likely to be used in the industry for many years to come.

We will use Qdrant to store the data from MongoDB after it has been processed and transformed for use in GenAI applications.
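
Conceptually, a vector database stores embeddings alongside a payload and returns the stored items closest to a query embedding. The sketch below illustrates that idea with a brute-force cosine-similarity search in plain Python. It is not Qdrant's API or its indexing algorithm; the InMemoryVectorStore class and the tiny three-dimensional vectors are hypothetical stand-ins for a real client and real embeddings.

```python
import math
from dataclasses import dataclass, field


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class InMemoryVectorStore:
    """Toy store: keeps (vector, payload) pairs and scans all of them on search."""

    points: list[tuple[list[float], dict]] = field(default_factory=list)

    def add(self, vector: list[float], payload: dict) -> None:
        self.points.append((vector, payload))

    def search(self, query: list[float], limit: int = 3) -> list[tuple[float, dict]]:
        """Return up to `limit` (score, payload) pairs, most similar first."""
        scored = [(cosine_similarity(query, vector), payload) for vector, payload in self.points]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:limit]


store = InMemoryVectorStore()
store.add([0.9, 0.1, 0.0], {"text": "Fine-tuning an LLM with LoRA"})
store.add([0.1, 0.9, 0.0], {"text": "Deploying a model on AWS"})
store.add([0.8, 0.2, 0.1], {"text": "Preparing an instruction dataset"})

results = store.search([1.0, 0.0, 0.0], limit=2)
for score, payload in results:
    print(f"{score:.3f}  {payload['text']}")
```

A linear scan like this becomes too slow as the collection grows, which is why production vector databases rely on approximate nearest-neighbor indexes instead.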

Qdrant is used by big players such as X (formerly Twitter), Disney, Microsoft, Discord, and Johnson & Johnson. Thus, it is highly probable that Qdrant will remain in the vector database game for a long time.

While we were writing the book, other popular options were Milvus, Redis, Weaviate, Pinecone, Chroma, and pgvector (a PostgreSQL plugin for vector indexes). We found that Qdrant offers the best trade-off between requests per second (RPS), latency, and indexing time, making it a solid choice for many generative AI applications.
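
The three metrics mentioned above can be made concrete with a small timing harness. This is a minimal sketch, not the method behind the comparison: benchmark, build_index, and search are hypothetical names, and the toy dictionary "index" stands in for a real database client. Queries run sequentially, so the reported RPS is a single-client figure rather than a load-test result.

```python
import math
import statistics
import time
from typing import Any, Callable, Sequence


def benchmark(
    build_index: Callable[[], Any],
    search: Callable[[Any, Any], Any],
    queries: Sequence[Any],
) -> dict[str, float]:
    """Measure index build time, per-query latency, and sequential throughput."""
    if not queries:
        raise ValueError("at least one query is required")

    start = time.perf_counter()
    index = build_index()
    index_time = time.perf_counter() - start

    latencies = []
    for query in queries:
        t0 = time.perf_counter()
        search(index, query)
        latencies.append(time.perf_counter() - t0)

    latencies.sort()
    # Nearest-rank 95th percentile.
    p95 = latencies[math.ceil(0.95 * len(latencies)) - 1]
    total = sum(latencies)
    return {
        "index_time_s": index_time,
        "mean_latency_s": statistics.mean(latencies),
        "p95_latency_s": p95,
        "rps": len(latencies) / total if total > 0 else float("inf"),
    }


# Toy stand-ins: a dict lookup plays the role of an index and a search call.
metrics = benchmark(
    build_index=lambda: {i: i * i for i in range(10_000)},
    search=lambda index, query: index.get(query),
    queries=list(range(1_000)),
)
print(metrics)
```

To compare real databases this way, build_index would load the same dataset into each one and search would issue the same queries, ideally on identical hardware and with concurrent clients.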

Comparing all the vector databases in detail could be a chapter in itself, so we won't do that here. If you are curious, you can check the Vector DB Comparison resource from Superlinked at https://superlinked.com/vector-db-comparison, which compares the top vector databases across a wide range of criteria, from the license and release year to database features, embedding models, and frameworks supported.