Embedding Models
Embedding models are powerful machine learning techniques that map high-dimensional data into a lower-dimensional space while preserving its essential features. Crucial in natural language processing (NLP), they transform sparse word representations into dense vectors, capturing semantic similarities between words. Embedding models also process images, audio, video, and structured data, enhancing applications in recommendation systems, anomaly detection, and clustering.
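To make "capturing semantic similarities" concrete: two dense vectors can be compared with cosine similarity, where a score close to 1 means the underlying texts are semantically close. The following is a minimal sketch using NumPy and made-up three-dimensional vectors (real embeddings have hundreds or thousands of dimensions; the values and labels here are purely illustrative):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: close to 1.0 means very similar, near 0 unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 3-dimensional embeddings, for illustration only
vec_movie = np.array([0.21, -0.34, 0.88])   # "space adventure film"
vec_query = np.array([0.19, -0.30, 0.91])   # "sci-fi movie in space"
vec_other = np.array([-0.75, 0.10, 0.02])   # "tax filing deadline"

print(cosine_similarity(vec_movie, vec_query))  # high, ~0.998
print(cosine_similarity(vec_movie, vec_other))  # low, negative here
```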
Here is an example of an embedding model in action. Suppose the full plot of every movie in a database has been previously embedded using OpenAI's text-embedding-ada-002 embedding model. Your goal is to find all movies and animations related to Guardians of the Galaxy, but not through traditional phonetic or lexical matching (where you would type some of the words in the title). Instead, you will search by semantic means, say, with the phrase Awkward team of space defenders. You will then use the same embedding model to embed this phrase and query the embedded movie plots. Table 4.1 shows an excerpt of the resulting embedding:
| Dimension | Value |
| --- | --- |
| 1 | 0.00262913 |
| 2 | 0.031449784 |
| 3 | 0.0020321296 |
| ... | ... |
| 1535 | -0.01821267 |
| 1536 | 0.0014683881 |
Table 4.1: Excerpt of embedding
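As a preview of the implementation later in this chapter, here is a minimal sketch of how such an embedding can be produced with the langchain-openai library. It assumes the package is installed (pip install langchain-openai) and that an OPENAI_API_KEY environment variable is set:

```python
from langchain_openai import OpenAIEmbeddings

# Use the same model that embedded the movie plots
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")

# Embed the semantic search phrase; the result is a list of
# 1,536 floats, like the excerpt shown in Table 4.1
query_vector = embeddings.embed_query("Awkward team of space defenders")

print(len(query_vector))   # 1536
print(query_vector[:3])    # first three dimensions
```

The same call on each stored plot (or embed_documents for batches of texts) produces vectors that can then be compared with a similarity metric such as cosine similarity to rank the closest matches.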
This chapter will help you understand embedding models in depth. You'll also implement an example using the Python language and the langchain-openai library.
This chapter will cover the following topics:
- Differentiation between embedding models and LLMs
- Types of embedding models
- How to choose an embedding model
- Vector representations