Indexing
The first stage of the RAG system that we will examine more closely is indexing. We are skipping the setup, where we install and import packages and set up OpenAI and related accounts, because that step is common to every generative artificial intelligence (AI) project, not just RAG systems. We provided a thorough setup guide in Chapter 2, so jump back there if you want to review the libraries we added to support these next steps.
Indexing is the first main stage of RAG. As Figure 4.2 indicates, it is the step after the user query:
Figure 4.2 – The Indexing stage of RAG highlighted
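To give a rough feel for what this stage begins with, the following sketch mimics the loading part of indexing using only the Python standard library: it takes an HTML string and keeps just the visible text. This is not the Chapter 2 code, which relies on LangChain's WebBaseLoader to fetch and parse a real web page. The TextExtractor class, the load_text function, and the sample HTML are hypothetical names and data invented purely for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text from HTML, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a <script> or <style> element
        self.parts = []       # visible text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep only non-empty text that is not inside a skipped element.
        if text and self._skip_depth == 0:
            self.parts.append(text)


def load_text(html: str) -> str:
    """Hypothetical stand-in for a document loader: HTML in, plain text out."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "\n".join(parser.parts)


# Made-up sample page; a real loader would download this from a URL.
sample_html = """
<html>
  <head><title>Sample page</title><style>p { color: red; }</style></head>
  <body>
    <h1>Introduction to RAG</h1>
    <p>Indexing prepares documents for retrieval.</p>
    <script>console.log("ignored");</script>
  </body>
</html>
"""

document_text = load_text(sample_html)
print(document_text)
```

The point of the sketch is only that indexing starts from raw source material and reduces it to text the rest of the pipeline can work with. A production loader also handles fetching, encodings, and metadata, which this example leaves out.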
In our code from Chapter 2, Indexing is the first section of code you see. This is the step where the data you are introducing to the RAG system is processed. As you can see in the code, the data in this scenario is the web document that is being loaded by WebBaseLoader. This is the beginning of that document (Figure 4.3):