Optimizing retrieval-augmented generation
Optimizing the semantic data model itself, through the choice of vector embedding model and metadata enrichment, is only one way to improve a RAG application. This section covers strategies for optimizing other components and stages of the RAG pipeline.
Key areas of optimization include query handling, formatting of ingested data, retrieval system configuration, and application-level guardrails. Optimizing these areas can improve the accuracy, relevance, and overall performance of RAG applications.
Note
This section covers more advanced techniques than the ones discussed in Chapter 8, Implementing Vector Search in AI Applications.
Query mutation
In the naive RAG approach, you use direct user input to create the embedding used in vector search, perhaps augmented with metadata as discussed earlier in the chapter. However, you can drive better search performance by mutating...
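As a point of reference, the naive approach can be sketched in a few lines of Python. This is a minimal illustration, not a production retriever: `toy_embed` is a hypothetical stand-in for a real embedding model (it only counts words), `DOCS` is made-up sample data, and `naive_search` is an illustrative name. The point it shows is that the raw user input is embedded as is, with no rewriting step between the user and the vector search.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def naive_search(user_input: str, documents: list[str], k: int = 2) -> list[str]:
    """Naive RAG retrieval: embed the user input directly and rank documents."""
    query_vec = toy_embed(user_input)  # raw user input, no mutation applied
    scored = [(cosine(query_vec, toy_embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


# Made-up sample corpus, for illustration only.
DOCS = [
    "How to reset your account password",
    "Vector search indexes store embeddings",
    "Chunking strategies for ingested documents",
]

top_hit = naive_search("reset password", DOCS, k=1)
```

In a sketch like this, any query mutation would slot in just before the call to `toy_embed` inside `naive_search`, which is why the quality of the text handed to the embedding step matters so much for what the search returns.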