Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases now! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Conferences
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Generative AI Application Integration Patterns

You're reading from   Generative AI Application Integration Patterns Integrate large language models into your applications

Arrow left icon
Product type Paperback
Published in Sep 2024
Publisher Packt
ISBN-13 9781835887608
Length 218 pages
Edition 1st Edition
Languages
Arrow right icon
Authors (2):
Arrow left icon
Luis Lopez Soria Luis Lopez Soria
Author Profile Icon Luis Lopez Soria
Luis Lopez Soria
Juan Pablo Bustos Juan Pablo Bustos
Author Profile Icon Juan Pablo Bustos
Juan Pablo Bustos
Arrow right icon
View More author details
Toc

Table of Contents (13) Chapters Close

Preface 1. Introduction to Generative AI Patterns 2. Identifying Generative AI Use Cases FREE CHAPTER 3. Designing Patterns for Interacting with Generative AI 4. Generative AI Batch and Real-Time Integration Patterns 5. Integration Pattern: Batch Metadata Extraction 6. Integration Pattern: Batch Summarization 7. Integration Pattern: Real-Time Intent Classification 8. Integration Pattern: Real-Time Retrieval Augmented Generation 9. Operationalizing Generative AI Integration Patterns 10. Embedding Responsible AI into Your GenAI Applications 11. Other Books You May Enjoy
12. Index

Summary

In this chapter, you’ve explored an integration pattern that combines RAG and generative AI models to build a chatbot capable of answering questions based on a document corpus. You’ve learned that RAG leverages the strengths of retrieval systems and generative models, allowing the system to retrieve relevant context from existing knowledge sources and generate contextual responses, preventing hallucinations and ensuring accuracy.

We proposed an architecture that utilized a serverless, event-driven approach built on Google Cloud. It consists of an ingestion layer for accepting user queries, a document corpus management layer for storing embeddings, an AI processing layer integrating with Google Gemini on Vertex AI, and monitoring and logging components. The entry point handles various input modalities like text, audio, and images, pre-processing them as needed.

You’ve learned that the core of the RAG pipeline involves generating embeddings from...

lock icon The rest of the chapter is locked
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €18.99/month. Cancel anytime