Code lab 3.1 – Adding sources to your RAG
Many of the applications mentioned previously include an element of adding more data to the response. For example, if you have a RAG pipeline that crawls legal documents or scientific research papers as part of the efforts described in the Expanding and enhancing private data with general external knowledge bases or Innovation scouting and trend analysis sections, you are likely going to want to cite the sources behind your response.
We will continue the code from Chapter 2 and add this valuable step of returning the retrieved documents in the RAG response.
Starting with the code from Chapter 2, we need to introduce a few new elements, which I will walk through and explain as we go:
from langchain_core.runnables import RunnableParallel
This is a new import: the RunnableParallel object from LangChain runnables. It introduces the concept of running the retriever and the question handling in parallel, which can improve performance by allowing the retriever to fetch the context while the question is being processed simultaneously:
rag_chain_from_docs = (
    RunnablePassthrough.assign(
        context=(lambda x: format_docs(x["context"])))
    | prompt
    | llm
    | StrOutputParser()
)

rag_chain_with_source = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
).assign(answer=rag_chain_from_docs)
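Before we compare the two chains, it may help to see RunnableParallel on its own. The following standalone sketch is not part of the chapter's pipeline, and the branch names (upper and length) are invented for illustration; it shows how every branch receives the same input and the results come back as a dictionary keyed by branch name:

from langchain_core.runnables import RunnableLambda, RunnableParallel

# Both branches receive the same input string; their outputs are
# collected into a dict keyed by the names given below.
branches = RunnableParallel(
    {"upper": RunnableLambda(lambda s: s.upper()),
     "length": RunnableLambda(lambda s: len(s))}
)

print(branches.invoke("retrieval augmented generation"))
# {'upper': 'RETRIEVAL AUGMENTED GENERATION', 'length': 30}

This is exactly the pattern used above: the retriever and RunnablePassthrough() are the two branches, and "context" and "question" are the keys.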
Compare this to our original rag_chain object:
rag_chain = (
    {"context": retriever | format_docs,
     "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
In the original code, rag_chain is constructed using a dictionary that combines the retriever and the format_docs function for "context", and RunnablePassthrough() for "question". This dictionary is then piped (|) through prompt, llm, and StrOutputParser().
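As a reminder, format_docs is the helper we defined in Chapter 2 that joins the retrieved Document objects into a single string for the prompt. If you don't have that code handy, a minimal sketch looks like this (your Chapter 2 version may differ slightly):

def format_docs(docs):
    # Join the text of each retrieved Document into one prompt-ready string
    return "\n\n".join(doc.page_content for doc in docs)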
In this new version, the construction of rag_chain is split into two chains, rag_chain_from_docs and rag_chain_with_source:
- The rag_chain_from_docs chain is created using RunnablePassthrough.assign() to format the documents retrieved for the context. It then pipes the formatted context through prompt, llm, and StrOutputParser().
- The rag_chain_with_source chain is created using RunnableParallel() to run the retriever and RunnablePassthrough() in parallel for "context" and "question", respectively. The result is then assigned to "answer" using rag_chain_from_docs (see the sketch after this list for how .assign() merges keys).
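The .assign() call is doing the heavy lifting here: it takes the dictionary produced by the step before it, computes a new key from that dictionary, and merges the new key in without discarding the existing ones. Here is a toy sketch; the names doubled, original, and summed are invented purely for illustration:

from langchain_core.runnables import RunnableLambda, RunnableParallel

# The parallel step builds a dict; .assign() then adds a key computed
# from that dict while keeping the original keys intact.
parallel = RunnableParallel(
    {"doubled": RunnableLambda(lambda x: x * 2),
     "original": RunnableLambda(lambda x: x)}
)
chain = parallel.assign(
    summed=RunnableLambda(lambda d: d["doubled"] + d["original"]))

print(chain.invoke(3))
# {'doubled': 6, 'original': 3, 'summed': 9}

In our pipeline, the preserved keys are "context" and "question", and the added key is "answer", which is why the final output contains all three.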
The main difference in functionality between these two approaches is that the new approach separates the retrieval of the context from the formatting and processing of the retrieved documents. This allows for more flexibility in handling the retrieved context before passing it through the prompt, LLM, and output parser.
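For example, because the raw Document list is now available before any formatting happens, you could trim, filter, or rerank it before the answer is generated. The following is a hypothetical sketch, not part of the chapter's code:

# Hypothetical: keep only the first two retrieved documents before
# they are formatted and passed to the LLM.
rag_chain_trimmed = (
    RunnableParallel(
        {"context": retriever, "question": RunnablePassthrough()}
    )
    .assign(context=lambda x: x["context"][:2])  # override with a shorter list
    .assign(answer=rag_chain_from_docs)
)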
Finally, we have to update the name of the chain that we pass the user query to so that it matches the new chain name, rag_chain_with_source. As we did in the past, we call the invoke method, rag_chain_with_source.invoke(), passing it the question. This triggers the parallel execution of the retriever and question, followed by the formatting and processing of the retrieved context using rag_chain_from_docs to generate the final answer:
rag_chain_with_source.invoke(
    "What are the Advantages of using RAG?")
The output will look like this (I shortened some of the text to fit into this book, but you should see the full printout when running the code!):
{'context': [
    Document(page_content='Can you imagine what you could do with all of the benefits mentioned above…',
             metadata={'source': 'https://kbourne.github.io/chapter1.html'}),
    Document(page_content='Maintaining this integration over time, especially as data sources evolve or expand…',
             metadata={'source': 'https://kbourne.github.io/chapter1.html'}),
…}
This output looks more like raw code than our previous final output did, but it contains all of the information you would provide back to the user to indicate the source of the response. For many use cases, sourcing the material is very important: it helps the user understand why the response was what it was, lets them fact-check it, and gives them something to build on if they have anything else to add.
Note the metadata source listed after each page_content instance; this is what you would provide as the source link. In situations where the retrieval step returns multiple documents, the source could differ across each individual document, but here we only use one source document.
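If you want to turn that raw dictionary into a user-facing answer with citations, a short post-processing step is all it takes. This sketch relies only on the keys shown in the output above:

# Invoke the chain, then pull the answer plus the unique source URLs
# out of the retrieved documents' metadata.
response = rag_chain_with_source.invoke(
    "What are the Advantages of using RAG?")

sources = {doc.metadata["source"] for doc in response["context"]}

print(response["answer"])
for source in sorted(sources):
    print(f"Source: {source}")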