The architecture of RAG systems
The following are the stages of a RAG process from a user’s perspective:
- A user enters a query/question.
- The application takes a moment to search the data it has access to and find the content most relevant to the query.
- The application responds with an answer to the user’s question, drawing on the data supplied to it through the RAG pipeline.
From a technical standpoint, these steps capture two of the stages you will code: the retrieval and generation stages. There is also a third stage, known as indexing, which can be, and often is, executed before the user enters a query. During indexing, you turn supporting data into vectors, store them in a vector database, and typically optimize the search functionality so that the retrieval step is as fast and effective as possible.
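To make the three stages concrete, here is a minimal sketch in plain Python. It is an illustration under heavy assumptions, not how a production RAG system is built: the `embed` function is a toy bag-of-words stand-in for a real embedding model, a Python list stands in for a vector database, and the generation stage stops at building the prompt that would be sent to a language model. All function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): word counts instead of a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(documents):
    """Indexing: vectorize and store the data before any query arrives."""
    return [(doc, embed(doc)) for doc in documents]


def retrieve(index, query, k=1):
    """Retrieval: rank stored documents by similarity to the query."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(query, context_docs):
    """Generation (first half): combine the query with the retrieved context.

    A real system would pass this prompt to a language model.
    """
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer the question using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical supporting data, indexed ahead of time.
documents = [
    "Refunds are allowed within 30 days of purchase under the return policy.",
    "Standard shipping takes five business days.",
    "Support is available by email on weekdays.",
]
index = build_index(documents)

# At query time, only retrieval and generation run.
query = "What is the return policy for refunds?"
top_docs = retrieve(index, query, k=1)
prompt = build_prompt(query, top_docs)
print(prompt)
```

Note that `build_index` runs once, up front, while `retrieve` and `build_prompt` run for every query. That split is the reason indexing can happen before the user ever enters a question.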
Once the user passes their query into the system, the following steps occur:
- The user query...