Fundamentals of AI agents and RAG integration
When talking with developers who are new to generative AI, we have found that the concept of an AI agent tends to be one of the more challenging ones to grasp. When experts talk about agents, they often do so in very abstract terms, focusing on everything AI agents can be responsible for in a RAG application but failing to explain thoroughly what an AI agent actually is and how it works.
I find that it is easiest to dispel the mystery of the AI agent by explaining what it really is, which is a very simple concept. To build an AI agent in its most basic form, you take the same LLM concept you have been working with throughout these chapters and add a loop that terminates when the intended task is done. That’s it! It’s just a loop, folks!
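To make that claim concrete, here is a minimal sketch of an agent as "an LLM plus a loop." The `call_llm` function is a hypothetical stand-in for a real LLM call (it scripts a two-step run so the loop logic can be demonstrated), but the shape of `run_agent` is the essence of the idea:

```python
# A minimal sketch of an AI agent: an LLM call inside a loop that
# terminates when the task is done. `call_llm` is a toy stand-in,
# not a real model call.

def call_llm(history):
    """Pretend LLM: asks for a retrieval step, then finishes."""
    if not any("retrieved:" in msg for msg in history):
        return "ACTION: retrieve"
    return "FINAL: answer based on retrieved context"

def run_agent(task, max_steps=5):
    history = [f"task: {task}"]
    for _ in range(max_steps):          # the loop IS the agent
        response = call_llm(history)
        if response.startswith("FINAL:"):
            return response             # terminate: task is done
        # Otherwise, act on the LLM's request and loop again.
        history.append("retrieved: some relevant documents")
    return "FINAL: gave up after max_steps"

print(run_agent("What is RAG?"))
```

Note the `max_steps` guard: in practice you always cap the loop so an agent that never reaches a final answer cannot run forever.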
Figure 12.1 represents the RAG agent loop you will be working with in the code lab that you are about to dive into:
Figure 12.1 – Graph of the agent’s control flow
This represents a relatively simple set of logic steps that loop through until the agent decides it has successfully completed the task you have given it. The oval boxes, such as agent and retrieve, are called nodes and the lines are called edges. The dotted lines are also edges, but they are a specific type called conditional edges, which are edges that are also decision points.
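The node/edge/conditional-edge vocabulary can be illustrated in plain Python before you see the real LangGraph code in the lab. In this sketch (an illustration, not the lab's actual implementation), nodes are functions, an edge maps a node to its successor, and a conditional edge chooses the successor based on the node's output:

```python
# Pure-Python illustration of the graph in Figure 12.1: nodes are
# functions, edges are a successor map, and a conditional edge is a
# decision point that inspects the state to pick the next node.

def agent(state):
    # Decide whether we still need to retrieve or can finish.
    state["decision"] = "retrieve" if not state.get("docs") else "end"
    return state

def retrieve(state):
    state["docs"] = ["doc1", "doc2"]    # stand-in for a retriever call
    return state

nodes = {"agent": agent, "retrieve": retrieve}
edges = {"retrieve": "agent"}                     # plain edge
conditional_edges = {                             # dotted-line edge
    "agent": lambda s: "retrieve" if s["decision"] == "retrieve" else None
}

def run_graph(start="agent"):
    state, current = {}, start
    while current is not None:
        state = nodes[current](state)
        if current in conditional_edges:
            current = conditional_edges[current](state)  # branch
        else:
            current = edges.get(current)                 # follow edge
    return state

print(run_graph())
```

Running this traces the loop in the figure: agent decides to retrieve, retrieve returns to agent, and agent then decides the task is complete.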
Despite its simplicity, adding a loop to your LLM calls makes your application much more powerful than using an LLM directly, because it takes better advantage of the LLM’s ability to reason and break tasks down into simpler ones. This improves the chances of success in whatever task you are pursuing and will come in especially handy with more complex, multi-step RAG tasks.
While your LLM is looping through agent tasks, you also provide functions called tools to the agent, and the LLM will use its reasoning capabilities to determine which tool to use, how to use that tool, and what data to feed it. This is where it can get really complex very quickly. You can have multiple agents, numerous tools, integrated knowledge graphs that help guide your agents down a specific path, numerous frameworks that offer different flavors of agents, numerous approaches to agent architecture, and much more. But in this chapter, we are going to focus specifically on how an AI agent can help improve RAG applications. Once you see the power of using an AI agent though, I have no doubt you will want to use it in other generative AI applications, and you should!
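The tool mechanism can be sketched in a few lines. In this toy example, the tool registry is ordinary Python, and `pick_tool` is a hypothetical stand-in for the LLM's reasoning step that decides which tool to call and what input to feed it:

```python
# Sketch of providing "tools" to an agent. A real agent would ask the
# LLM to choose; here `pick_tool` fakes that decision by inspecting
# the task text, so the dispatch mechanics stand out.

def search_docs(query):
    """Toy retrieval tool."""
    return f"top passages for '{query}'"

def calculator(expression):
    """Toy math tool (eval is for illustration only)."""
    return str(eval(expression))

tools = {"search_docs": search_docs, "calculator": calculator}

def pick_tool(task):
    # Stand-in for the LLM deciding which tool to use and how.
    if any(ch.isdigit() for ch in task):
        return "calculator", task
    return "search_docs", task

def run_with_tools(task):
    name, tool_input = pick_tool(task)
    return tools[name](tool_input)

print(run_with_tools("2 + 2"))          # calculator branch
print(run_with_tools("agentic RAG"))    # retrieval branch
```

In a real framework, the LLM receives each tool's name and description in its prompt and emits a structured tool call; the surrounding loop executes the tool and feeds the result back to the model.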
Living in an AI agent world
With all the excitement around agents, you might think LLMs are already becoming obsolete. But that couldn’t be further from the truth. With AI agents, you are really tapping into an even more powerful version of an LLM, a version where the LLM serves as the “brain” of the agent, letting it reason and come up with multi-step solutions well beyond the one-off chat questions most people use LLMs for. The agent just provides a layer between the user and the LLM and pushes the LLM to accomplish a task that may take multiple queries of the LLM but will, in theory, end up with a much better result.
If you think about it, this matches up more with how problems are solved in the real world, where even simple decisions can be complex. Most tasks we do are based on a long chain of observations, reasoning, and adjustments to new experiences. Very rarely do we interact with people, tasks, and things in the real world in the same way we interact with LLMs online. There is often this building of understanding, knowledge, and context that takes place and helps us find the best solutions. AI agents are better able to handle this type of approach to problem-solving.
Agents can make a big difference to your RAG efforts, but what about this idea of LLMs serving as their brains? Let’s dive into that concept further.
LLMs as the agents’ brains
If you consider the LLM as the brain of your AI agent, the next logical step is that you likely want the smartest LLM you can find to be that brain. The capabilities of the LLM are going to affect your AI agent’s ability to reason and make decisions, which will certainly impact the results of the queries to your RAG application.
There is one major way this metaphor of an LLM brain breaks down, though, and in a very good way. Unlike agents in the real world, an AI agent can always swap out its LLM brain for another LLM brain. We could even give it multiple LLM brains that check each other and make sure things are proceeding as planned. This flexibility will help us continually improve the capabilities of our agents.
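This brain-swapping idea falls out naturally if you design the agent to take its LLM as a parameter. Here is a small sketch under that assumption; both "brains" are toy stubs standing in for real model clients:

```python
# Sketch of a swappable "LLM brain": the agent is parameterized by the
# model it calls, so changing brains (or adding a second brain as a
# checker) requires no change to the agent logic itself.

def fast_llm(prompt):
    return f"fast answer to: {prompt}"      # stand-in for a small model

def careful_llm(prompt):
    return f"careful answer to: {prompt}"   # stand-in for a larger model

def make_agent(brain):
    def agent(task):
        return brain(task)                  # the brain is pluggable
    return agent

agent_a = make_agent(fast_llm)
agent_b = make_agent(careful_llm)           # same agent, different brain

print(agent_a("summarize chapter 12"))
print(agent_b("summarize chapter 12"))
```

The same pattern supports the multiple-brains idea: a second `checker` brain could review the first brain's draft before the agent returns it.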
So, how does LangGraph, or graphs in general, relate to AI agents? We will discuss that next.