Unlocking Data with Generative AI and RAG
Code lab 12.1 – adding a LangGraph agent to RAG

In this code lab, we will add an agent to our existing RAG pipeline that can decide whether to retrieve from an index or use a web search. We will show the inner thoughts of the agent as it processes the data it retrieves, working toward the goal of providing you with a more thorough response to your question. As we add the code for our agent, we will see new components, such as tools, toolkits, graphs, nodes, edges, and, of course, the agent itself. For each component, we will go more in-depth into how it interacts with and supports your RAG application. We will also add code so that this functions more like a chat session, rather than a Q&A session:

  1. First, we will install some new packages to support our agent development:
    %pip install tiktoken
    %pip install langgraph

    In the first line, we install the tiktoken package, which is an OpenAI package used for tokenizing text data before feeding it into language models. In the second line, we pull in the langgraph package we have been discussing.
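
    As a quick illustration of what tiktoken does (a minimal sketch, separate from the lab's required code; the sample string is just an example):

    import tiktoken

    # Look up the tokenizer that matches the model we are using
    encoding = tiktoken.encoding_for_model("gpt-4o-mini")

    # Encode a sample string into token IDs and count them
    tokens = encoding.encode("How can agents improve a RAG pipeline?")
    print(len(tokens))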

  2. Next, we add a new LLM definition and update our existing one:
    llm = ChatOpenAI(model_name="gpt-4o-mini",
        temperature=0, streaming=True)
    agent_llm = ChatOpenAI(model_name="gpt-4o-mini",
        temperature=0, streaming=True)

The new agent_llm LLM instance will serve as our agent’s brain, handling reasoning and the execution of agent tasks, whereas the original llm instance will continue to serve as our general LLM, handling the same tasks we have used it for in the past. While the two LLMs are defined with the same model and parameters in our example, you could and should experiment with using different LLMs for these different tasks, to see if there is a combination that works better for your RAG applications. You could even add additional LLMs to handle specific tasks, such as the improve or score_documents functions in this code, if you find an LLM better at those tasks or have trained or fine-tuned your own for these particular actions. For example, it is common for simple tasks to be handled by faster, lower-cost LLMs, as long as they can perform the task successfully. There is a lot of flexibility built into this code that you can take advantage of! Also, note that we add streaming=True to the LLM definition. This turns on streaming data from the LLM, which is more conducive to an agent that may make several calls, sometimes in parallel, while constantly interacting with the LLM.
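
For instance, a minimal sketch of that mix-and-match idea, with the model names chosen purely as an illustration (they are assumptions, not the lab's choices):

llm = ChatOpenAI(model_name="gpt-4o-mini",
    temperature=0, streaming=True)  # general generation tasks
agent_llm = ChatOpenAI(model_name="gpt-4o",
    temperature=0, streaming=True)  # agent reasoning, using a stronger model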

Now, we are going to skip down to after the retriever definitions (dense_retriever, sparse_retriever, and ensemble_retriever) and add our first tool. A tool has a very specific and important meaning when it comes to agents; so, let’s talk about that now.

Tools and toolkits

In the following code, we are going to add a web search tool:

import os
from dotenv import load_dotenv
from langchain_community.tools.tavily_search import TavilySearchResults

# Load the Tavily API key from env.txt and expose it to the environment
_ = load_dotenv(dotenv_path='env.txt')
os.environ['TAVILY_API_KEY'] = os.getenv('TAVILY_API_KEY')

# Set up the web search tool, capped at four results per search
web_search = TavilySearchResults(max_results=4)
web_search_name = web_search.name

You will need to get another API key and add it to the env.txt file we have used in the past for the OpenAI and Together APIs. Just like with those APIs, you will need to go to the provider’s website, set up your API key, and then copy it into your env.txt file. The Tavily website can be found at this URL: https://tavily.com/

We run the code that loads the data from the env.txt file again, and then we set up the TavilySearchResults object with max_results set to 4, meaning that when we run a search, we want at most four search results. We then assign web_search.name to a variable called web_search_name so that we have it available later when we want to tell the agent about the tool. You can run this tool directly using this code:

web_search.invoke(user_query)

Running this tool code with user_query will give you a result like this (truncated for brevity):

[{'url': 'http://sustainability.google/',
  'content': "Google Maps\nChoose the most fuel-efficient route\nGoogle Shopping\nShop for more efficient appliances for your home\nGoogle Flights\nFind a flight with lower per-traveler carbon emissions\nGoogle Nest\...[TRUNCATED HERE]"},
…
  'content': "2023 Environmental Report. Google's 2023 Environmental Report outlines how we're driving positive environmental outcomes throughout our business in three key ways: developing products and technology that empower individuals on their journey to a more sustainable life, working together with partners and organizations everywhere to transition to resilient, low-carbon systems, and operating ..."}]

We truncated this output so that it takes up less space in the book, but try this in the code and you will see four results, as we asked for, and they all seem to be highly related to the topic user_query is asking about. Note that you will not need to run this tool directly in your code like we just did; the agent will invoke it when it decides a web search is needed.

At this point, you have just established your first agent tool! This is a search engine tool that your agent can use to retrieve more information from the internet to help it achieve its goal of answering the question your user poses to it.

The tool concept in LangChain and when building agents comes from the idea that you want to make actions available to your agent so that it can carry out its tasks. Tools are the mechanism that allows this to happen. You define a tool like we just did for the web search, and then you later add it to a list of tools that the agent can use to accomplish its tasks. Before we set up that list though, we want to create another tool that is central for a RAG application: a retriever tool:

from langchain.tools.retriever import create_retriever_tool

# Wrap our ensemble retriever in a tool the agent can call
retriever_tool = create_retriever_tool(
    ensemble_retriever,
    "retrieve_google_environmental_question_answers",
    "Extensive information about Google environmental efforts from 2023.",
)
retriever_tool_name = retriever_tool.name

Note that with the web search tool, we imported it from langchain_community.tools.tavily_search, whereas with this tool, we use langchain.tools.retriever. This reflects the fact that Tavily is a third-party tool, whereas the retriever tool we create here is part of the core LangChain functionality. After importing the create_retriever_tool function, we use it to create the retriever_tool tool for our agent. Again, as with web_search_name, we pull out the retriever_tool.name variable so that we can reference it later when we want to tell the agent about the tool. You may notice the name of the actual retriever this tool will use, the ensemble_retriever retriever, which we created in Chapter 8’s 8.3 code lab!

You should also note that the name we are giving this tool, as far as the agent is concerned, is found in the second field; we are calling it retrieve_google_environmental_question_answers. When we name variables in code, we normally try to keep them short, but for tools that agents will use, it is helpful to provide more verbose names that help the agent fully understand what the tool can be used for.
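
Just like with the web search tool, you can invoke the retriever tool directly to see exactly what the agent will receive; a minimal sketch, separate from the lab's required code:

# Returns the retrieved documents' contents joined into one string
retriever_tool.invoke(user_query)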

We now have two tools for our agent! However, we still need to tell the agent about them eventually; so, we package them up into a list that we can later share with the agent:

tools = [web_search, retriever_tool]

You see here the two tools we created previously, the web_search tool and the retriever_tool tool, getting added to the tools list. If we had other tools we wanted to make available to the agent, we could add those to the list as well. In the LangChain ecosystem, there are hundreds of tools available: https://python.langchain.com/v0.2/docs/integrations/tools/

You will want to make sure the LLM you are using is “good” at reasoning and using tools. In general, chat models tend to have been fine-tuned for tool calling and will be better at using tools. Models that have not been fine-tuned for tool calling may not be able to use tools, especially if the tools are complex or require multiple calls. Using well-written names and descriptions can also play an important role in setting your agent LLM up for success.
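
One way to check this wiring is LangChain’s bind_tools method on chat models, which attaches your tool definitions so the model can emit tool calls; a minimal sketch, assuming the agent_llm and tools defined earlier:

# Attach the tool list to the agent's LLM
agent_llm_with_tools = agent_llm.bind_tools(tools)

# When the model decides a tool is needed, the response carries a
# tool call instead of a direct answer
response = agent_llm_with_tools.invoke(user_query)
print(response.tool_calls)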

In the agent we are building, we have all the tools we need, but you will also want to look at toolkits, which are convenient groups of tools. LangChain provides a list of the currently available toolkits on their website: https://python.langchain.com/v0.2/docs/integrations/toolkits/

For example, if you have a data infrastructure that uses pandas DataFrames, you could use the pandas DataFrame toolkit to offer your agent various tools for accessing those DataFrames in different ways. Drawing straight from the LangChain website (https://python.langchain.com/v0.1/docs/modules/agents/concepts/#toolkits), toolkits are described as follows:

For many common tasks, an agent will need a set of related tools. For this LangChain provides the concept of toolkits - groups of around 3-5 tools needed to accomplish specific objectives. For example, the GitHub toolkit has a tool for searching through GitHub issues, a tool for reading a file, a tool for commenting, etc.

So, basically, if you are focusing on a set of common tasks for your agent or a popular integration partner with LangChain (such as a Salesforce integration), there is likely a toolkit that will give you access to all the tools you need at once.
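
As a rough sketch of how a toolkit is used in code (the SQL database toolkit here is an illustrative choice, not part of this lab):

from langchain_community.agent_toolkits import SQLDatabaseToolkit
from langchain_community.utilities import SQLDatabase

# Connect to a database and wrap it in a toolkit
db = SQLDatabase.from_uri("sqlite:///example.db")
toolkit = SQLDatabaseToolkit(db=db, llm=llm)

# A toolkit hands back a list of related tools, ready to pass to an agent
sql_tools = toolkit.get_tools()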

Now that we have the tools established, let’s start building the components of our agent, starting with the agent state.

Agent state

The agent state is a key component of any agent you build with LangGraph. Using LangGraph, you create an AgentState class that establishes the “state” for your agent and tracks it over time. This state is local to the agent; you make it available to all parts of the graph, and it can be stored in a persistence layer.

Here, we set up this state for our RAG agent:

from typing import Annotated, Literal, Sequence, TypedDict
from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages
class AgentState(TypedDict):
    messages: Annotated[Sequence[BaseMessage],
                        add_messages]

This imports the relevant packages for setting up AgentState. For example, BaseMessage is a base class for representing messages in the conversation between the user and the AI agent. It will be used to define the structure and properties of messages in the state of the conversation. LangGraph then defines a graph and passes a "state" object around to each node. You can set the state to be a variety of object types in which you can store different kinds of data, but for our RAG agent, we set up our state to be a list of "messages".
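
The add_messages annotation is what tells LangGraph how to update that state: new messages are appended to the existing list rather than replacing it. A minimal sketch of the reducer’s behavior on its own, outside any graph:

from langchain_core.messages import AIMessage, HumanMessage
from langgraph.graph.message import add_messages

existing = [HumanMessage(content="What are Google's environmental initiatives?")]
update = [AIMessage(content="Let me check the 2023 report...")]

# add_messages merges the update into the existing list, preserving order
merged = add_messages(existing, update)
print(len(merged))  # 2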

We then need to import another round of packages to set up other parts of our agent:

from langchain_core.messages import HumanMessage
from langchain_core.pydantic_v1 import BaseModel, Field
from langgraph.prebuilt import tools_condition

In this code, we start by importing HumanMessage. HumanMessage is a specific type of message that represents a message sent by the human user. It will be used when constructing the prompt for the agent to generate a response. We also import BaseModel and Field. BaseModel is a class from the Pydantic library that is used to define data models and validate data. Field is a class from Pydantic that is used to define the properties and validation rules for fields in a data model. Last, we import tools_condition. The tools_condition function is a pre-built function provided by the LangGraph library. It is used to assess the agent’s decision on whether to use specific tools based on the current state of the conversation.

These imported classes and functions are used throughout the code to define the structure of messages, validate data, and control the flow of the conversation based on the agent’s decisions. They provide the necessary building blocks and utilities for constructing the language model application using the LangGraph library.
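
To make the BaseModel and Field roles concrete, here is a hedged sketch of the kind of data model they enable, such as one a document-scoring step might rely on (the grade class and its field are illustrative assumptions, not the book's exact code):

class grade(BaseModel):
    """Binary relevance score for a retrieved document."""
    # Field supplies the description the LLM sees when producing this value
    binary_score: str = Field(
        description="Relevance score: 'yes' or 'no'")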

We then define our primary prompt (representing what the user would input) like this:

generation_prompt = PromptTemplate.from_template(
    """You are an assistant for question-answering tasks.
    Use the following pieces of retrieved context to answer
    the question. If you don't know the answer, just say
    that you don't know. Provide a thorough description to
    fully answer the question, utilizing any relevant
    information you find.
    Question: {question}
    Context: {context}
    Answer:"""
)

This is a replacement for the code that we were using in the past code labs:

prompt = hub.pull("jclemens24/rag-prompt")

We alter the name to generation_prompt to make this prompt’s use clearer.
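
If you want to confirm that the renamed prompt behaves like the old one, here is a minimal sketch of wiring it into a generation chain (assuming the llm and StrOutputParser pattern from the earlier code labs):

from langchain_core.output_parsers import StrOutputParser

# Pipe the prompt into the general LLM and parse the reply into a string
generation_chain = generation_prompt | llm | StrOutputParser()
answer = generation_chain.invoke(
    {"question": user_query, "context": "<retrieved context here>"})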

Our graph usage is about to pick up in our code, but first, we need to cover some basic graph theory concepts.
