The retrieval-augmented generation pattern
Foundation models are frozen in time: they know only what they were trained on and have no access to an organization’s private data or to public information that has changed since training. To improve the accuracy of responses, especially when they depend on proprietary or up-to-date data, we need a mechanism that integrates external information into the model’s response generation process.
This is where retrieval-augmented generation (RAG) comes in. RAG is a new architecture pattern for generative AI solutions that depend on external data sources, such as enterprise knowledge search and document question answering. RAG has two main stages:
- The indexing stage, which prepares a knowledge base by ingesting data and building indexes.
- The query stage, which retrieves relevant context from the knowledge base and passes it to the LLM to generate a response.
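The two stages can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector store, and the sample documents and all function names are hypothetical. The sketch stops at building the prompt; sending it to an LLM is left out because that call depends on the provider.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): a bag-of-words term-count vector.
    A real system would call an embedding model here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents):
    """Indexing stage: ingest documents and store each one with its vector."""
    return [(doc, embed(doc)) for doc in documents]


def retrieve(index, question, top_k=2):
    """Query stage, step 1: rank indexed documents by similarity to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]


def build_prompt(question, context):
    """Query stage, step 2: combine retrieved context and the question
    into the prompt that would be passed to the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {question}"
    )


# Hypothetical knowledge base, for illustration only.
documents = [
    "The refund policy allows returns within 30 days of purchase.",
    "Support is available by email on weekdays.",
    "Shipping to Europe takes five business days.",
]

index = build_index(documents)                      # indexing stage
question = "What is the refund policy for returns?"
context = retrieve(index, question, top_k=1)        # query stage: retrieval
prompt = build_prompt(question, context)            # query stage: prompt for the LLM
print(prompt)
```

Keeping indexing and querying as separate functions mirrors the pattern itself: the index is built once, ahead of time, while retrieval and prompt construction run on every question.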
Architecturally, RAG consists...