Code lab 12.1 – adding a LangGraph agent to RAG
In this code lab, we will add an agent to our existing RAG pipeline that can decide whether to retrieve from an index or use a web search. We will surface the agent's inner reasoning as it processes the data it retrieves, working toward a more thorough response to your question. As we add the code for our agent, we will encounter new components, such as tools, toolkits, graphs, nodes, edges, and, of course, the agent itself. For each component, we will look more closely at how it interacts with and supports your RAG application. We will also add code so that this functions more like a chat session, rather than a Q&A session:
- First, we will install some new packages to support our agent development:
```python
%pip install tiktoken
%pip install langgraph
```
In the first line, we install the `tiktoken` package, an OpenAI package used for tokenizing text data before feeding it into language models. In the second line, we pull in the `langgraph` package we have been discussing.
- Next, we add a new LLM definition and update our existing one:
```python
llm = ChatOpenAI(model_name="gpt-4o-mini",
                 temperature=0, streaming=True)

agent_llm = ChatOpenAI(model_name="gpt-4o-mini",
                       temperature=0, streaming=True)
```
The new `agent_llm` instance will serve as our agent's brain, handling reasoning and execution of the agent's tasks, whereas the original `llm` instance will continue to serve as our general LLM, handling the same tasks we have used it for in the past. While the two LLMs are defined with the same model and parameters in our example, you could, and should, experiment with using different LLMs for these different tasks, to see if there is a combination that works better for your RAG application. You could even add additional LLMs to handle specific tasks, such as the `improve` or `score_documents` functions in this code, if you find an LLM that is better at those tasks or have trained or fine-tuned your own for these particular actions. For example, it is common to hand simple tasks to faster, lower-cost LLMs, as long as they can perform the task successfully. There is a lot of flexibility built into this code that you can take advantage of! Also, note that we add `streaming=True` to the LLM definitions. This turns on streaming of data from the LLM, which is more conducive to an agent that may make several calls, sometimes in parallel, while constantly interacting with the LLM.
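As one hedged illustration of that flexibility, the sketch below adds a hypothetical third LLM dedicated to document scoring; the `scoring_llm` name and the cheaper model choice are our own assumptions, not part of the lab's code:

```python
from langchain_openai import ChatOpenAI

# Hypothetical task-specific LLM: scoring retrieved documents is a simple,
# high-volume task, so a faster, lower-cost model may be good enough here,
# while agent_llm above keeps handling the harder reasoning work
scoring_llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
```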
Now, we are going to skip down to just after the retriever definitions (`dense_retriever`, `sparse_retriever`, and `ensemble_retriever`) and add our first tool. A tool has a very specific and important meaning when it comes to agents, so let's talk about that now.
Tools and toolkits
In the following code, we are going to add a web search tool:
```python
from langchain_community.tools.tavily_search import TavilySearchResults

# Load the Tavily API key from env.txt and expose it to the Tavily client
_ = load_dotenv(dotenv_path='env.txt')
os.environ['TAVILY_API_KEY'] = os.getenv('TAVILY_API_KEY')

web_search = TavilySearchResults(max_results=4)
web_search_name = web_search.name
```
You will need to get another API key and add it to the `env.txt` file we have used in the past for the OpenAI and Together APIs. Just like with those APIs, you will need to go to the provider's website, set up your API key, and then copy it into your `env.txt` file. The Tavily website can be found at this URL: https://tavily.com/

We run the code that loads the data from the `env.txt` file again, and then we set up the `TavilySearchResults` object with `max_results` of `4`, meaning that when we run a search, we want at most four search results. We then assign the value of `web_search.name` to a variable called `web_search_name` so that we have it available later when we want to tell the agent about this tool. You can run this tool directly using this code:
```python
web_search.invoke(user_query)
```
Running this tool code with `user_query` will give you a result like this (truncated for brevity):
```
[{'url': 'http://sustainability.google/', 'content': "Google Maps\nChoose the most fuel-efficient route\nGoogle Shopping\nShop for more efficient appliances for your home\nGoogle Flights\nFind a flight with lower per-traveler carbon emissions\nGoogle Nest\...[TRUNCATED HERE]"},
 … 'content': "2023 Environmental Report. Google's 2023 Environmental Report outlines how we're driving positive environmental outcomes throughout our business in three key ways: developing products and technology that empower individuals on their journey to a more sustainable life, working together with partners and organizations everywhere to transition to resilient, low-carbon systems, and operating ..."}]
```
We truncated this so that it takes up less space in the book, but try this in the code and you will see four results, as we asked for, all of which are highly related to the topic that `user_query` asks about. Note that you will not need to run this tool directly in your code like we just did.
At this point, you have just established your first agent tool! This is a search engine tool that your agent can use to retrieve more information from the internet to help it achieve its goal of answering the question your user poses to it.
The tool concept in LangChain, and in agent building generally, comes from the idea that you want to make actions available to your agent so that it can carry out its tasks. Tools are the mechanism that allows this to happen. You define a tool, as we just did for the web search, and then later add it to a list of tools that the agent can use to accomplish its tasks. Before we set up that list, though, we want to create another tool that is central to a RAG application: a retriever tool:
```python
from langchain.tools.retriever import create_retriever_tool

retriever_tool = create_retriever_tool(
    ensemble_retriever,
    "retrieve_google_environmental_question_answers",
    "Extensive information about Google environmental efforts from 2023.",
)
retriever_tool_name = retriever_tool.name
```
Note that with the web search tool, we imported it from `langchain_community.tools.tavily_search`, whereas with this tool, we use `langchain.tools.retriever`. This reflects the fact that Tavily is a third-party tool, whereas the retriever tool we create here is part of the core LangChain functionality. After importing the `create_retriever_tool` function, we use it to create the `retriever_tool` tool for our agent. Again, as with `web_search_name`, we pull the value of `retriever_tool.name` into a variable we can reference later when we want to refer to this tool for the agent. You may notice the name of the actual retriever this tool will use, the `ensemble_retriever` retriever, which we created in Chapter 8's 8.3 code lab!
You should also note that the name we are giving this tool, as far as the agent is concerned, is found in the second argument; we are calling it `retrieve_google_environmental_question_answers`. When we name variables in code, we normally try to keep them short, but for tools that agents will use, it is helpful to provide more verbose names that help the agent fully understand what the tool can be used for.
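If you want to sanity-check this tool the way we ran the web search tool directly, here is a hedged sketch; the exact output will depend on your `ensemble_retriever`, and invoking the tool this way is only for inspection, not something the finished agent requires:

```python
# Directly invoke the retriever tool with the user's query to inspect
# the document text the agent would receive from it
docs_text = retriever_tool.invoke({"query": user_query})
print(docs_text[:500])  # preview the first 500 characters
```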
We now have two tools for our agent! However, we still need to tell the agent about them eventually; so, we package them up into a list that we can later share with the agent:
```python
tools = [web_search, retriever_tool]
```
Here, you can see the two tools we created previously, the `web_search` tool and the `retriever_tool` tool, being added to the `tools` list. If we had other tools we wanted to make available to the agent, we could add them to the list as well. In the LangChain ecosystem, there are hundreds of tools available: https://python.langchain.com/v0.2/docs/integrations/tools/
You will want to make sure the LLM you are using is “good” at reasoning and using tools. In general, chat models tend to have been fine-tuned for tool calling and will be better at using tools. Non-chat-fine-tuned models may not be able to use tools, especially if the tools are complex or require multiple calls. Using well-written names and descriptions can play an important role in setting your agent LLM up for success as well.
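As a hedged preview of how the agent LLM gets access to this list, most chat models in LangChain expose a `bind_tools` method; the sketch below assumes that API along with the `agent_llm` and `tools` variables from above:

```python
# Bind the tool definitions to the agent's LLM so that, when prompted,
# the model can emit structured tool calls for web_search or retriever_tool
agent_llm_with_tools = agent_llm.bind_tools(tools)

# The bound model is invoked like any other chat model; if it decides a
# tool is needed, the response carries tool_calls instead of plain text
response = agent_llm_with_tools.invoke(user_query)
print(response.tool_calls)
```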
In the agent we are building, we have all the tools we need, but you will also want to look at toolkits, which are convenient groups of tools. LangChain provides a list of the currently available toolkits on their website: https://python.langchain.com/v0.2/docs/integrations/toolkits/

For example, if you have a data infrastructure that uses pandas DataFrames, you could use the pandas DataFrame toolkit to offer your agent various tools for accessing those DataFrames in different ways. As described in the LangChain documentation (https://python.langchain.com/v0.1/docs/modules/agents/concepts/#toolkits), toolkits are groups of related tools that an agent needs to accomplish specific objectives.

So, basically, if you are focusing on a set of common tasks for your agent or a popular integration partner with LangChain (such as a Salesforce integration), there is likely a toolkit that will give you access to all the tools you need at once.
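To make the loading pattern concrete, here is a hedged sketch using the SQL database toolkit (chosen as an example because its API is stable and well documented); the SQLite URI is a placeholder, and adding the resulting tools to our `tools` list is optional:

```python
from langchain_community.agent_toolkits import SQLDatabaseToolkit
from langchain_community.utilities import SQLDatabase

# Placeholder database URI; point this at your own data infrastructure
db = SQLDatabase.from_uri("sqlite:///example.db")

# A toolkit bundles several related tools behind a single object...
sql_toolkit = SQLDatabaseToolkit(db=db, llm=llm)

# ...and get_tools() unpacks the individual tools, which could then be
# appended to the same tools list we built above
sql_tools = sql_toolkit.get_tools()
```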
Now that we have the tools established, let’s start building the components of our agent, starting with the agent state.
Agent state
The agent state is a key component of any agent you build with LangGraph. Using LangGraph, you create an `AgentState` class that establishes the "state" for your agent and tracks it over time. This state is local to the agent, is made available to all parts of the graph, and can be stored in a persistence layer.
Here, we set up this state for our RAG agent:
```python
from typing import Annotated, Literal, Sequence, TypedDict

from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages

class AgentState(TypedDict):
    # The conversation history; add_messages appends new messages
    # to the list rather than overwriting it
    messages: Annotated[Sequence[BaseMessage], add_messages]
```
This imports the packages relevant to setting up `AgentState`. For example, `BaseMessage` is a base class for representing messages in the conversation between the user and the AI agent. It is used to define the structure and properties of the messages in the conversation state. LangGraph defines a graph whose nodes pass a `"state"` object around to each other. You can define the state as a variety of object types to store different kinds of data, but for our RAG agent, we set our state to be a list of `"messages"`. The `add_messages` annotation tells LangGraph to append new messages to this list rather than replace it, so the conversation history accumulates as the graph runs.
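To see that append behavior in isolation, `add_messages` can also be called directly as a plain function; this little check is our own illustration, not part of the lab:

```python
from langchain_core.messages import AIMessage, HumanMessage
from langgraph.graph.message import add_messages

existing = [HumanMessage(content="What are Google's environmental initiatives?")]
updated = add_messages(existing, [AIMessage(content="Google's 2023 report covers...")])

# Both messages are retained: the new one was appended, not substituted
print(len(updated))  # 2
```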
We then need to import another round of packages to set up other parts of our agent:
```python
from langchain_core.messages import HumanMessage
from langchain_core.pydantic_v1 import BaseModel, Field
from langgraph.prebuilt import tools_condition
```
In this code, we start by importing `HumanMessage`. `HumanMessage` is a specific type of message that represents a message sent by the human user. It will be used when constructing the prompt for the agent to generate a response. We also import `BaseModel` and `Field`. `BaseModel` is a class from the Pydantic library that is used to define data models and validate data. `Field` is a Pydantic class used to define the properties and validation rules for the fields in a data model. Last, we import `tools_condition`. The `tools_condition` function is a pre-built function provided by the LangGraph library. It is used to act on the agent's decision about whether to use specific tools based on the current state of the conversation.
These imported classes and functions are used throughout the code to define the structure of messages, validate data, and control the flow of the conversation based on the agent's decisions. They provide the necessary building blocks and utilities for constructing a language model application with the LangGraph library.
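As a hedged preview of where `tools_condition` will fit (the full graph wiring comes later in this lab), it is typically used as the routing function on a conditional edge. In this sketch, the node names and the `call_agent` helper are illustrative assumptions, and `agent_llm_with_tools` is the tool-bound model sketched earlier:

```python
from langgraph.graph import StateGraph
from langgraph.prebuilt import ToolNode, tools_condition

def call_agent(state: AgentState):
    # Hypothetical agent node: ask the tool-bound LLM what to do next
    response = agent_llm_with_tools.invoke(state["messages"])
    return {"messages": [response]}

workflow = StateGraph(AgentState)
workflow.add_node("agent", call_agent)
workflow.add_node("tools", ToolNode(tools))  # pre-built node that runs tool calls

# tools_condition inspects the agent's last message: if it contains tool
# calls, route to the "tools" node; otherwise the graph ends
workflow.add_conditional_edges("agent", tools_condition)
workflow.add_edge("tools", "agent")
workflow.set_entry_point("agent")

# Compiling produces the runnable graph; a checkpointer could be passed
# here to persist state across turns for chat-session behavior
graph = workflow.compile()
```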
We then define our primary prompt (representing what the user would input) like this:
```python
generation_prompt = PromptTemplate.from_template(
    """You are an assistant for question-answering tasks.
    Use the following pieces of retrieved context to answer the question.
    If you don't know the answer, just say that you don't know.
    Provide a thorough description to fully answer the question,
    utilizing any relevant information you find.

    Question: {question}
    Context: {context}

    Answer:"""
)
```
This replaces the code we were using in past code labs:

```python
prompt = hub.pull("jclemens24/rag-prompt")
```

We change the name to `generation_prompt` to make this prompt's use clearer.
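As a hedged sketch of how this prompt might be wired up, assuming the same LCEL pipe pattern used in earlier labs (the `rag_chain` name and the use of `StrOutputParser` here are our assumptions):

```python
from langchain_core.output_parsers import StrOutputParser

# Pipe the prompt into the general-purpose LLM and parse the reply to a string
rag_chain = generation_prompt | llm | StrOutputParser()

answer = rag_chain.invoke({
    "question": user_query,
    "context": "...retrieved documents would go here...",
})
```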
Our graph usage is about to pick up in our code, but first, we need to cover some basic graph theory concepts.