Code lab 3.1 – Adding sources to your RAG
Many of the applications mentioned earlier include an element of adding more data to the response. For example, if your RAG pipeline crawls legal documents or scientific research papers as part of the efforts described in the Expanding and enhancing private data with general external knowledge bases or Innovation scouting and trend analysis sections, you will likely want to quote the sources of your response.
We will continue with the code from Chapter 2 and add this valuable step of returning the retrieved documents in the RAG response.
Starting from that code, we need to introduce a few new elements, which I will walk through and explain as we go:
from langchain_core.runnables import RunnableParallel
This is a new import: the RunnableParallel object from LangChain runnables. It introduces the concept of running the retriever and the question handling in parallel, which can improve performance by letting the retriever fetch the context while the question is being processed:
rag_chain_from_docs = (
    RunnablePassthrough.assign(
        context=(lambda x: format_docs(x["context"])))
    | prompt
    | llm
    | StrOutputParser()
)
rag_chain_with_source = RunnableParallel(
    {"context": retriever,
     "question": RunnablePassthrough()}
).assign(answer=rag_chain_from_docs)
Compare this to our original rag_chain object:
rag_chain = (
    {"context": retriever | format_docs,
     "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
In the original code, rag_chain is constructed using a dictionary that combines the retriever and the format_docs function for "context", and RunnablePassthrough() for "question". This dictionary is then piped (|) through prompt, llm, and StrOutputParser().
In the new version, the construction of rag_chain is split into two chains:
- The rag_chain_from_docs chain is created using RunnablePassthrough.assign() to format the documents retrieved from the context. It then pipes the formatted context through prompt, llm, and StrOutputParser().
- The rag_chain_with_source chain is created using RunnableParallel() to run the retriever and RunnablePassthrough() in parallel for "context" and "question", respectively. The result is then assigned to "answer" using rag_chain_from_docs, as the standalone sketch after this list illustrates.
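To see the mechanics of RunnableParallel in isolation, here is a minimal, self-contained sketch. The fake_retriever and fake_answerer functions are illustrative stand-ins for the real retriever and LLM chain, not part of the chapter's code:

from langchain_core.runnables import RunnableParallel, RunnablePassthrough

def fake_retriever(question):
    # Stand-in for a real retriever: returns a list of "documents"
    return [f"doc about {question}"]

def fake_answerer(inputs):
    # Stand-in for prompt | llm | parser: reads the assembled dictionary
    return f"answer based on {inputs['context']}"

# RunnableParallel feeds the same input to every branch;
# .assign() then adds a new key computed from the resulting dictionary
demo_chain = RunnableParallel(
    {"context": fake_retriever,
     "question": RunnablePassthrough()}
).assign(answer=fake_answerer)

print(demo_chain.invoke("RAG"))
# {'context': ['doc about RAG'], 'question': 'RAG',
#  'answer': "answer based on ['doc about RAG']"}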
The main difference in functionality between these two approaches is that the new approach separates the retrieval of the context from the formatting and processing of the retrieved documents. This allows for more flexibility in handling the retrieved context before passing it through the prompt, LLM, and output parser.
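For example, because the raw documents are available under the "context" key before rag_chain_from_docs runs, you can slot your own step between retrieval and formatting. The following sketch is hypothetical: the filter_docs helper, its 50-character threshold, and the *_filtered chain names are illustrative assumptions layered on top of the chapter's retriever, format_docs, prompt, and llm objects:

# Hypothetical variant: drop very short documents before they are
# formatted and sent to the LLM
def filter_docs(docs):
    return [doc for doc in docs if len(doc.page_content) > 50]

rag_chain_from_docs_filtered = (
    RunnablePassthrough.assign(
        context=(lambda x: format_docs(filter_docs(x["context"]))))
    | prompt
    | llm
    | StrOutputParser()
)
rag_chain_with_source_filtered = RunnableParallel(
    {"context": retriever,
     "question": RunnablePassthrough()}
).assign(answer=rag_chain_from_docs_filtered)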
Finally, we have to change the name of the chain we pass the user query to so that it matches the new chain name, rag_chain_with_source. As we did in the past, we call the invoke method, rag_chain_with_source.invoke(), passing it the question. This triggers the parallel execution of the retriever and question handling, followed by the formatting and processing of the retrieved context using rag_chain_from_docs to generate the final answer:
rag_chain_with_source.invoke(
    "What are the Advantages of using RAG?")
The output will look like this (I shortened some of the text to fit into this book, but you should see the full printout when running the code!):
{'context': [
   Document(page_content='Can you imagine what you could do with all of the benefits mentioned above…',
            metadata={'source': 'https://kbourne.github.io/chapter1.html'}),
   Document(page_content='Maintaining this integration over time, especially as data sources evolve or expand…',
            metadata={'source': 'https://kbourne.github.io/chapter1.html'}),…}
This output looks more like raw code than our previous final output, but it contains all of the information you would provide back to the user to indicate the source of the response. For many use cases, sourcing the material is very important in helping users understand why the response was what it was, in fact-checking it, and in building on it if they have anything else they need to add.
Note the metadata source listed after each page_content instance, which is what you would provide as the source link. In situations where you have multiple source documents in the results, this could differ across each individual document returned in the retrieval step, but we only use one source document here.
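If you want to turn this raw dictionary into a more user-friendly response, one simple (hypothetical) post-processing approach is to pull the source values out of each retrieved document's metadata and print them alongside the answer:

# Hypothetical post-processing of the chain's output: collect the
# unique source URLs from the retrieved documents' metadata
result = rag_chain_with_source.invoke(
    "What are the Advantages of using RAG?")

sources = {doc.metadata["source"] for doc in result["context"]}

print(result["answer"])
print("Sources:")
for source in sorted(sources):
    print(f"- {source}")

With the single source document used in this chapter, this prints just one link, but the same loop handles multi-source retrievals unchanged.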