Naïve, advanced, and modular RAG configurations
A RAG framework necessarily contains two main components: a retriever and a generator. The generator can be any LLM or foundation multimodal AI platform or model, such as GPT-4o, Gemini, Llama, or one of the hundreds of variations of the initial architectures. The retriever can be any of the emerging frameworks, methods, and tools such as Activeloop, Pinecone, LlamaIndex, LangChain, Chroma, and many more.
The issue now is to decide which of the three types of RAG frameworks (Gao et al., 2024) will fit the needs of a project. We will illustrate these three approaches in code in the Naïve, advanced, and modular RAG in code section of this chapter:
- Naïve RAG: This type of RAG framework doesn’t involve complex data embedding and indexing. It can be efficient to access reasonable amounts of data through keywords, for example, to augment a user’s input and obtain a satisfactory response.
- Advanced RAG: This type of RAG involves more complex scenarios, such as with vector search and indexed-base retrieval applied. Advanced RAG can be implemented with a wide range of methods. It can process multiple data types, as well as multimodal data, which can be structured or unstructured.
- Modular RAG: Modular RAG broadens the horizon to include any scenario that involves naïve RAG, advanced RAG, machine learning, and any algorithm needed to complete a complex project.
However, before going further, we need to decide if we should implement RAG or fine-tune a model.