Getting started with conversational applications
A conversational application is software that interacts with users in natural language. It can serve various purposes, such as providing information, assistance, entertainment, or transactions. Conversational applications can use different modes of communication, such as text, voice, graphics, or even touch, and they can run on different platforms, such as messaging apps, websites, mobile devices, or smart speakers.
Today, conversational applications are being taken to the next level thanks to LLMs. Let’s look at some of the benefits that they provide:
- Not only do LLMs provide a new level of natural language interaction, but they also enable applications to reason about the best response given users’ preferences.
- As we saw in previous chapters, LLMs can leverage their parametric knowledge, but are also enriched with non-parametric knowledge, thanks to embeddings and plug-ins.
- Finally, LLMs are also able to keep track of the conversation thanks to different types of memory.
The following image shows what the architecture of a conversational bot might look like:
Figure 6.1: Sample architecture of a conversational bot
Throughout this chapter, we will build, from scratch, a text-based conversational application that helps users plan their vacations. We will call this app GlobeBotter. We will add incremental layers of complexity to make the app as enjoyable as possible for the end user.
So, let’s start with the basics behind a conversational app architecture.
Creating a plain vanilla bot
To start with, let’s initialize our LLM and set the schema for our bot. The schema refers to the types of messages the bot can receive. In our case, we will have three types of messages:
- System message: The instructions we give the bot so that it behaves as a travel assistant.
- AI message: The message generated by the LLM.
- Human message: The user’s query.
Let’s start with a simple configuration:
from langchain.schema import (
AIMessage,
HumanMessage,
SystemMessage
)
from langchain.chains import LLMChain, ConversationChain
from langchain.chat_models import ChatOpenAI
chat = ChatOpenAI()
messages = [
SystemMessage(content="You are a helpful assistant that help the user to plan an optimized itinerary."),
HumanMessage(content="I'm going to Rome for 2 days, what can I visit?")]
We can then save and print the output as follows:
output = chat(messages)
print(output.content)
Here is the output:
In Rome, there are many famous attractions to visit. Here's an optimized itinerary for your two-day trip:
Day 1:
1. Start your day by visiting the Colosseum, one of the world's most iconic ancient landmarks.
2. Next, explore the nearby Roman Forum, an ancient Roman marketplace.
3. Afterward, head to the Pantheon, a well-preserved Roman temple with a stunning dome.
4. Take a stroll through the historic district of Trastevere, known for its charming streets and authentic Roman atmosphere.
5. In the evening, visit the Trevi Fountain and toss a coin to ensure your return to Rome.
Day 2:
1. Begin your day at Vatican City, the smallest independent state in the world. Visit St. Peter's Basilica and admire Michelangelo's masterpiece, the Sistine Chapel.
2. Explore the Vatican Museums, home to an extensive collection of art and historical artifacts.
3. Enjoy a leisurely walk along the Tiber River and cross over to the picturesque neighborhood of Castel Sant'Angelo.
4. Visit the Spanish Steps, a popular meeting point with a beautiful view of the city.
5. End your day by exploring the charming neighborhood of Piazza Navona, known for its baroque architecture and lively atmosphere.
Remember to check the opening hours and availability of tickets for the attractions in advance. Enjoy your trip to Rome!
As you can see, the model was pretty good at generating an itinerary for Rome with only one piece of information from our side: the number of days.
However, we might want to keep interacting with the bot, so that we can further optimize the itinerary, providing more information about our preferences and habits. To achieve that, we need to add memory to our bot.
Adding memory
As we’re creating a conversational bot with relatively short messages, a ConversationBufferMemory could be suitable in this scenario. To make the configuration easier, let’s also initialize a ConversationChain to combine the LLM and the memory components.
Let’s first initialize our memory and chain (I’m keeping verbose=True so that you can see the bot keeping track of previous messages):
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain
memory = ConversationBufferMemory()
conversation = ConversationChain(
llm=chat, verbose=True, memory=memory
)
Great, now let’s have some interactions with our bot:
conversation.run("Hi there!")
The following is the output:
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Current conversation:
Human: Hi there!
AI:
> Finished chain.
'Hello! How can I assist you today?'
Next, we provide the following input:
conversation.run("what is the most iconic place in Rome?")
Here is the corresponding output:
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Current conversation:
Human: Hi there!
AI: Hello! How can I assist you today?
Human: what is the most iconic place in Rome?
AI:
> Finished chain.
'The most iconic place in Rome is probably the Colosseum. It is a magnificent amphitheater that was built in the first century AD and is one of the most recognizable symbols of ancient Rome. The Colosseum was used for gladiatorial contests, public spectacles, and other events. Today, it is a major tourist attraction and a UNESCO World Heritage site.'
As you can see from the chain, it is keeping track of the previous interactions. Let’s challenge it and ask something related to the previous context:
conversation.run("What kind of other events?")
The following is the output that we receive:
> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Current conversation:
Human: Hi there!
AI: Hello! How can I assist you today?
Human: what is the most iconic place in Rome?
AI: The most iconic place in Rome is probably the Colosseum. It is a magnificent amphitheater that was built in the first century AD and is one of the most recognizable symbols of ancient Rome. The Colosseum was used for gladiatorial contests, public spectacles, and other events. Today, it is a major tourist attraction and a UNESCO World Heritage site.
Human: What kind of other events?
AI:
> Finished chain.
'Other events that took place at the Colosseum include mock sea battles, animal hunts, and reenactments of famous battles. The Colosseum was also used for executions and religious ceremonies. It was a versatile venue that could accommodate a variety of events and entertainments.'
The bot was able to understand that our request was related to its previous answer. We can also retrieve the message history with the memory.load_memory_variables() method (you can see the full output in the GitHub repository). Here is a snippet of the output:
{'history': 'Human: Hi there!\nAI: Hello! How can I assist you today?\nHuman: what is the most iconic place in Rome?....
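For reference, the method takes a dictionary of inputs, which can simply be empty for this memory type:
print(memory.load_memory_variables({}))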
Rather than running the conversation.run method at every interaction, I’ve coded a while loop to make it interactive. The following is a snapshot of the whole conversation (you can find it in the book’s GitHub repository):
while True:
    query = input('you: ')
    if query == 'q':
        break
    output = conversation({"input": query})
    print('User: ', query)
    print('AI system: ', output['response'])
The following is a truncated sample from the output (you can find the whole output in the book’s GitHub repository):
User: hello
AI system: Hello! How can I assist you today?
User: I'm planning a 1-day trip in Venice. What should I visit?
AI system: That sounds like a wonderful plan! In Venice, there are several must-visit attractions that you can explore in a day. Here's a suggested itinerary for your 1-day trip in Venice:
1. St. Mark's Square (Piazza San Marco): [...] Enjoy your trip to Venice!
User: thanks! I'm planning to be around also the next day, and I love hiking. Do you have any suggestion nearby?
AI system: Certainly! If you enjoy hiking and want to explore the natural beauty around Venice, there are a few options you can consider for the next day:
1. The Dolomites: [...]
User: which one is closer to Milan?
AI system: If you're looking for a hiking destination closer to Milan, the best option would be the Lombardy region [...]
As you can see, now the AI assistant is capable of keeping track of the whole conversation. In the next section, we are going to add yet another layer of complexity: an external knowledge base.
Adding non-parametric knowledge
Imagine that you also want your GlobeBotter to have access to exclusive documentation about itineraries that are not part of its parametric knowledge.
To do so, we can either embed the documentation in a VectorDB or directly use a retriever to do the job. In this case, we will use a vector-store-backed retriever with a particular chain, ConversationalRetrievalChain. This type of chain combines a retriever over the provided knowledge base with the chat history, which can be passed as a parameter using the desired type of memory, as seen previously.
With this goal in mind, we will use a sample Italy travel guide PDF downloaded from https://www.minube.net/guides/italy.
The following Python code shows how to initialize all the ingredients we need, which are:
- Document loader: Since the document is in PDF format, we will use PyPDFLoader.
- Text splitter: We will use a RecursiveCharacterTextSplitter, which splits text by recursively looking at characters to find one that works.
- Vector store: We will use the FAISS VectorDB.
- Memory: We will use a ConversationBufferMemory.
- LLM: We will use the gpt-3.5-turbo model for conversations.
- Embeddings: We will use the text-embedding-ada-002 model.
Let’s take a look at the code:
from langchain.chat_models import ChatOpenAI
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.document_loaders import PyPDFLoader
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
text_splitter = RecursiveCharacterTextSplitter(
chunk_size=1500,
chunk_overlap=200
)
raw_documents = PyPDFLoader('italy_travel.pdf').load()
documents = text_splitter.split_documents(raw_documents)
db = FAISS.from_documents(documents, OpenAIEmbeddings())
memory = ConversationBufferMemory(
memory_key='chat_history',
return_messages=True
)
llm = ChatOpenAI()
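Before interacting with the chain, you can optionally sanity-check the vector store with a direct similarity search (this verification step is our own addition, using the standard similarity_search method):
# Retrieve the two chunks most similar to a sample query to verify
# that the PDF was loaded and embedded correctly
docs = db.similarity_search('Pantheon', k=2)
print(docs[0].page_content[:200])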
Let’s now interact with the chain:
qa_chain = ConversationalRetrievalChain.from_llm(llm, retriever=db.as_retriever(), memory=memory, verbose=True)
qa_chain.run({'question':'Give me some review about the Pantheon'})
The following is the output (I’m reporting a truncated version. You can see the whole output in the book’s GitHub repository):
> Entering new StuffDocumentsChain chain...
> Entering new LLMChain chain...
Prompt after formatting:
System: Use the following pieces of context to answer the users question.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
----------------
cafes in the square. The most famous are the Quadri and
Florian.
Piazza San Marco,
Venice
4
Historical Monuments
Pantheon
Miskita:
"Angelic and non-human design," was how
Michelangelo described the Pantheon 14 centuries after its
construction. The highlights are the gigantic dome, the upper
eye, the sheer size of the place, and the harmony of the
whole building. We visited with a Roman guide which is
...
> Finished chain.
'Miskita:\n"Angelic and non-human design," was how Michelangelo described the Pantheon 14 centuries after its construction. The highlights
Note that, by default, the ConversationalRetrievalChain uses a prompt template called CONDENSE_QUESTION_PROMPT, which merges the last user query with the chat history so that the retriever receives just one standalone query. If you want to pass a custom prompt, you can do so using the condense_question_prompt parameter of the ConversationalRetrievalChain.from_llm method.
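For illustration, here is a minimal sketch of passing a custom condensing prompt (the template wording is our own; the only requirement is that it exposes the chat_history and question variables the chain expects):
from langchain.prompts import PromptTemplate
# Hypothetical custom template; {chat_history} and {question} are the
# variables the chain fills in before calling the LLM
custom_condense_prompt = PromptTemplate.from_template(
    """Given the following conversation and a follow-up question,
rephrase the follow-up question to be a standalone question.
Chat history:
{chat_history}
Follow-up question: {question}
Standalone question:"""
)
qa_chain = ConversationalRetrievalChain.from_llm(
    llm,
    retriever=db.as_retriever(),
    memory=memory,
    condense_question_prompt=custom_condense_prompt
)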
Even though the bot was able to provide an answer based on the documentation, we still have a limitation. In fact, with such a configuration, our GlobeBotter will only look at the provided documentation, but what if we want it to also use its parametric knowledge? For example, we might want the bot to be able to understand whether it could integrate with the provided documentation or simply answer freely. To do so, we need to make our GlobeBotter agentic, meaning that we want to leverage the LLM’s reasoning capabilities to orchestrate and invoke the available tools without a fixed order, but rather following the best approach given the user’s query.
To do so, we will use two main components:
- create_retriever_tool: This method creates a custom tool that acts as a retriever for an agent. It needs a database to retrieve from, a name, and a short description, so that the model can understand when to use it.
- create_conversational_retrieval_agent: This method initializes a conversational agent configured to work with retrievers and chat models. It needs an LLM, a list of tools (in our case, the retriever), and a memory key to keep track of the previous chat history.
The following code illustrates how to initialize the agent:
from langchain.agents.agent_toolkits import create_retriever_tool
tool = create_retriever_tool(
db.as_retriever(),
"italy_travel",
"Searches and returns documents regarding Italy."
)
tools = [tool]
memory = ConversationBufferMemory(
memory_key='chat_history',
return_messages=True
)
from langchain.agents.agent_toolkits import create_conversational_retrieval_agent
from langchain.chat_models import ChatOpenAI
llm = ChatOpenAI(temperature=0)
agent_executor = create_conversational_retrieval_agent(llm, tools, memory_key='chat_history', verbose=True)
Great, now let’s see the thought process of the agent with two different questions (I will report only the chain of thought and truncate the output, but you can find the whole code in the GitHub repo):
agent_executor({"input": "Tell me something about Pantheon"})
Here is the output:
> Entering new AgentExecutor chain...
Invoking: `italy_travel` with `Pantheon`
[Document(page_content='cafes in the square. The most famous are the Quadri and\nFlorian. […]
> Finished chain.
Let’s now try with a question not related to the document:
output = agent_executor({"input": "what can I visit in India in 3 days?"})
The following is the output that we receive:
> Entering new AgentExecutor chain...
In India, there are numerous incredible places to visit, each with its own unique attractions and cultural experiences. While three days is a relatively short time to explore such a vast and diverse country, here are a few suggestions for places you can visit:
1. Delhi: Start your trip in the capital city of India, Delhi. […]
> Finished chain.
As you can see, when I asked the agent something about Italy, it immediately invoked the provided document, while it did not for the last question.
The last thing we want to add to our GlobeBotter is the capability to navigate the web, since, as travelers, we want to have up-to-date information about the country we are traveling to. Let’s implement it with LangChain’s tools.
Adding external tools
The tool we are going to add here is the SerpApi tool for Google Search, so that our bot will be able to search the internet.
Note
SerpApi is a real-time API designed to access Google search results. It simplifies the process of data scraping by handling complexities such as managing proxies, solving CAPTCHAs, and parsing structured data from search engine results pages.
LangChain offers a pre-built tool that wraps SerpApi to make it easier to integrate within your agents. To enable SerpApi, you need to sign up at https://serpapi.com/users/sign_up, then go to the dashboard and copy your key from the API key tab.
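Once you have the key, you can make it available to your code, for example by setting the SERPAPI_API_KEY environment variable directly (a minimal sketch; the snippet below loads it from a .env file instead):
import os
# Placeholder value; replace it with your actual SerpApi key
os.environ["SERPAPI_API_KEY"] = "your-serpapi-api-key"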
Since we don’t want our GlobeBotter to be focused only on the web, we will add the SerpApi tool to the previous one, so that the agent will be able to pick the most useful tool to answer the question – or use no tool if not necessary.
Let’s initialize our tools and agent (you learned about this and other LangChain components in Chapter 5):
from langchain import SerpAPIWrapper
from langchain.agents import Tool
from dotenv import load_dotenv
# load_dotenv() reads SERPAPI_API_KEY from the .env file, making it
# available to the SerpAPIWrapper
load_dotenv()
search = SerpAPIWrapper()
tools = [
Tool.from_function(
func=search.run,
name="Search",
description="useful for when you need to answer questions about current events"
),
create_retriever_tool(
db.as_retriever(),
"italy_travel",
"Searches and returns documents regarding Italy."
)
]
agent_executor = create_conversational_retrieval_agent(llm, tools, memory_key='chat_history', verbose=True)
Great, now let’s test it with three different questions (here, again, the output has been truncated):
- “What can I visit in India in 3 days?”
> Entering new AgentExecutor chain... India is a vast and diverse country with numerous attractions to explore. While it may be challenging to cover all the highlights in just three days, here are some popular destinations that you can consider visiting: 1. Delhi: Start your trip in the capital city of India, Delhi. […] > Finished chain.
In this case, the model doesn’t need external knowledge to answer the question, hence it is responding without invoking any tool.
- “What is the weather currently in Delhi?”
> Entering new AgentExecutor chain... Invoking: `Search` with `{'query': 'current weather in Delhi'}` Current Weather · 95°F Mostly sunny · RealFeel® 105°. Very Hot. RealFeel Guide. Very Hot. 101° to 107°. Caution advised. Danger of dehydration, heat stroke, heat ...The current weather in Delhi is 95°F (35°C) with mostly sunny conditions. The RealFeel® temperature is 105°F (41°C), indicating that it feels very hot. Caution is advised as there is a danger of dehydration, heat stroke, and heat-related issues. It is important to stay hydrated and take necessary precautions if you are in Delhi or planning to visit. > Finished chain.
Note how the agent is invoking the search tool; this is due to the reasoning capability of the underlying gpt-3.5-turbo model, which captures the user’s intent and dynamically understands which tool to use to accomplish the request.
- “I’m traveling to Italy. Can you give me some suggestions for the main attractions to visit?”
> Entering new AgentExecutor chain... Invoking: `italy_travel` with `{'query': 'main attractions in Italy'}` [Document(page_content='ITALY\nMINUBE TRAVEL GUIDE\nThe best must-see places for your travels, […] Here are some suggestions for main attractions in Italy: 1. Parco Sempione, Milan: This is one of the most important parks in Milan. It offers a green space in the city where you can relax, workout, or take a leisurely walk. […] > Finished chain.
Note how the agent is invoking the document retriever to provide the preceding output.
Overall, our GlobeBotter is now able to provide up-to-date information as well as retrieve specific knowledge from curated documentation. The next step will be building a front-end; we will do so with a Streamlit web app.