Creating a chatbot using an LLM
In this recipe, we will create a chatbot using the LangChain framework. In the previous recipe, we learned how to ask an LLM questions based on a piece of content. Though the LLM was able to answer the questions accurately, the interaction with it was completely stateless: the LLM looked at each question in isolation and ignored any previous interactions or questions it was asked. In this recipe, we will use an LLM to create a chat interaction in which the LLM is aware of the previous conversation and uses the context from it to answer subsequent questions. An application of such a framework is conversing with document sources and getting to the right answer by asking a series of questions. These document sources can be of a wide variety of types, from internal company knowledge bases to customer contact center troubleshooting guides. Our goal here is to present a basic step-by-step framework that demonstrates the essential components working together to achieve the end goal.
Getting ready
We will use a model from OpenAI in this recipe. Please refer to Model access under the Technical requirements section to complete the steps for accessing the OpenAI model. You can use the 10.5_chatbot_with_llm.ipynb notebook from the code site if you want to work from an existing notebook.
How to do it…
The recipe does the following things:
- It initializes the ChatGPT LLM and an embedding provider. The embedding provider is used to vectorize the document content so that a vector-based similarity search can be performed.
- It scrapes content from a webpage and breaks it into chunks.
- The text in the document chunks is vectorized and stored in a vector store.
- A conversation is started with the LLM via some curated prompts, and a follow-up question is asked based on the answer the LLM provided in the previous turn.
Let’s get started:
- Do the necessary imports:
import getpass
import os

import bs4
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_core.messages import AIMessage, HumanMessage, BaseMessage
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import (
    ChatPromptTemplate,
    MessagesPlaceholder,
)
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_openai import ChatOpenAI
- In this step, we initialize the gpt-4o-mini model from OpenAI using the ChatOpenAI initializer:

os.environ["OPENAI_API_KEY"] = getpass.getpass()
llm = ChatOpenAI(model="gpt-4o-mini")
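If you want to confirm that the API key works and the model is reachable before building the rest of the pipeline, a quick smoke test along these lines can help (the prompt text is arbitrary):

# Optional smoke test: send a trivial prompt and print the model's reply.
print(llm.invoke("Say hello in five words or fewer.").content)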
- In this step, we load the embedding provider. The content from the webpage is vectorized via the embedding provider. We use the pre-trained sentence-transformers/all-mpnet-base-v2 model via the HuggingFaceEmbeddings constructor. This model works well for encoding short sentences or paragraphs, and the encoded vector representation captures the semantic context well. Please refer to the model card at https://huggingface.co/sentence-transformers/all-mpnet-base-v2 for more details:

embeddings_provider = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-mpnet-base-v2")
- In this step, we will load a webpage whose content we want to ask questions about. We initialize a WebBaseLoader object, pass it the URL, and call the load method on the loader instance. Feel free to change the link to any other webpage that you might want to use as the chat knowledge base:

loader = WebBaseLoader(
    ["https://lilianweng.github.io/posts/2023-06-23-agent/"])
docs = loader.load()
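If you want to verify that the page was fetched correctly before chunking it, a quick inspection of the loaded documents can be helpful:

# Optional check: confirm the page loaded and peek at its metadata and text.
print(len(docs), "document(s) loaded")
print(docs[0].metadata)            # source URL, title, and other metadata
print(docs[0].page_content[:200])  # the first few hundred characters of text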
- Initialize a text splitter instance of the RecursiveCharacterTextSplitter type and use it to split the documents into chunks:

text_splitter = RecursiveCharacterTextSplitter()
document_chunks = text_splitter.split_documents(docs)
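The default splitter settings are sufficient for this recipe, but if you want finer control over chunking, RecursiveCharacterTextSplitter accepts chunk_size and chunk_overlap arguments. The values below are illustrative, not prescribed by the recipe:

custom_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,   # maximum number of characters per chunk
    chunk_overlap=200  # characters shared between neighboring chunks
)
custom_chunks = custom_splitter.split_documents(docs)
print(len(custom_chunks), "chunks")
print(custom_chunks[0].page_content[:200])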
- We initialize the vector (embedding) store from the document chunks that we created in the previous step, passing it the document chunks and the embedding provider. We also initialize the vector store retriever and the output parser. The retriever will provide the augmented content to the chain via the vector store. We provided more details in the Augmenting the LLM with external content recipe in this chapter; to avoid repetition, we recommend referring to that recipe:

vectorstore = FAISS.from_documents(
    document_chunks, embeddings_provider)
retriever = vectorstore.as_retriever(search_type="similarity")
output_parser = StrOutputParser()
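Before wiring the retriever into a chain, you can optionally query it directly to confirm that the vector store returns relevant chunks. The query string below is an arbitrary example related to the scraped page:

# Optional check: run a similarity search directly against the retriever.
sample_docs = retriever.invoke("What is task decomposition?")
for doc in sample_docs[:2]:
    print(doc.page_content[:150], "...")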
- In this step, we initialize a contextualized system prompt. A system prompt defines the persona and the instructions to be followed by the LLM. In this case, the system prompt instructs the LLM to use the chat history to formulate a standalone question. We initialize the prompt instance with the system prompt definition and set it up with the expectation that a chat_history variable will be passed to it at runtime, along with the question, which will also be passed at runtime:

contextualize_q_system_prompt = """Given a chat history and the latest user question \
which might reference context in the chat history, formulate a standalone question \
which can be understood without the chat history. Do NOT answer the question, \
just reformulate it if needed and otherwise return it as is."""
contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{question}"),
    ]
)
- In this step, we initialize the contextualized chain. As you can see in the previous code snippet, the prompt is set up with the chat history. This chain takes the chat history and a given follow-up question from the user and reformulates it as a standalone question; the populated prompt template is then sent to the LLM. The idea here is that a subsequent question will not carry any context of its own and must be understood in terms of the chat history generated so far:

contextualize_q_chain = contextualize_q_prompt | llm | output_parser
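To see what this chain does on its own, you can invoke it with a toy chat history; the history messages below are made up purely for illustration:

standalone_question = contextualize_q_chain.invoke({
    "chat_history": [
        HumanMessage(content="What is an LLM agent?"),
        AIMessage(content="An LLM agent uses a language model as its "
                          "reasoning engine to plan and execute tasks."),
    ],
    "question": "What are its main components?",
})
# The chain should return something like
# "What are the main components of an LLM agent?"
print(standalone_question)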
- In this step, we initialize a RAG-style system prompt, much like in the previous recipe. This prompt just sets up a prompt template. However, as the chat history grows, we pass this prompt a contextualized question; in other words, this prompt always answers a contextualized question, barring the first one:

qa_system_prompt = """You are an assistant for question-answering tasks. \
Use the following pieces of retrieved context to answer the question. \
If you don't know the answer, just say that you don't know. \
Use three sentences maximum and keep the answer concise.

{context}"""
qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", qa_system_prompt),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{question}"),
    ]
)
- We initialize two helper methods. The contextualized_question method returns the contextualized chain if a chat history exists; otherwise, it returns the input question, which is the typical scenario for the first question. The format_docs method concatenates the page content of each document, separated by two newline characters:

def contextualized_question(input: dict):
    if input.get("chat_history"):
        return contextualize_q_chain
    else:
        return input["question"]

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
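As a small illustration of the routing logic, calling the helper with an empty history returns the raw question, whereas a non-empty history makes it return the contextualizing chain itself:

print(contextualized_question(
    {"question": "What is an agent?", "chat_history": []}))
# -> What is an agent?  (with a non-empty history, the chain object is returned)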
- In this step, we set up a chain. We use the RunnablePassthrough class to set up the context. The RunnablePassthrough class allows us to pass the input through or add additional data to it via dictionary values. The assign method takes a key and assigns a value to that key; in this case, the key is context and the assigned value is the result of chaining the contextualized question, the retriever, and format_docs. Putting that into the context of the entire recipe: for the first question, the context is the set of records matched for the question itself; for the second question, the contextualized question derived from the chat history is used to retrieve a set of matching records, which are then passed as the context. The LangChain framework uses a deferred execution model here. We set up the chain with the necessary constructs, such as context, qa_prompt, and the LLM; this just sets the expectation that each component will pipe its output to the next component when the chain is invoked. Any placeholder arguments that were set as part of the prompts will be populated and used during invocation (a small illustration of RunnablePassthrough.assign follows the code below):

rag_chain = (
    RunnablePassthrough.assign(
        context=contextualized_question | retriever | format_docs)
    | qa_prompt
    | llm
)
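If RunnablePassthrough.assign seems opaque, the following toy example (unrelated to the recipe's data) shows how it forwards the input dictionary unchanged while adding a new key computed from it:

# assign() keeps the original keys and adds "shouted", computed by the lambda.
toy_chain = RunnablePassthrough.assign(
    shouted=lambda x: x["question"].upper())
print(toy_chain.invoke({"question": "what is an llm?"}))
# -> {'question': 'what is an llm?', 'shouted': 'WHAT IS AN LLM?'}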
- In this step, we initialize a chat history array and ask a simple question by invoking the chain. Internally, this is treated as just the first question, since there is no chat history present at this point. The rag_chain simply answers the question and we print the answer. We also extend the chat_history with the question and the returned message:

chat_history = []
question = "What is a large language model?"
ai_msg = rag_chain.invoke(
    {"question": question, "chat_history": chat_history})
print(ai_msg.content)
chat_history.extend([HumanMessage(content=question),
                     AIMessage(content=ai_msg.content)])
This results in the following output:
A large language model (LLM) is an artificial intelligence system designed to understand and generate human-like text based on the input it receives. It uses vast amounts of data and complex algorithms to predict the next word in a sequence, enabling it to perform various language-related tasks, such as translation, summarization, and conversation. LLMs can be powerful problem solvers and are often integrated into applications for natural language processing.
- In this step, we invoke the chain again with a subsequent question, without providing many contextual cues. We provide the chain with the chat history and print the answer to the second question. Internally, the rag_chain and the contextualize_q_chain work in tandem to answer this question. The contextualize_q_chain uses the chat history to rewrite the follow-up question as a standalone question; the rag_chain then retrieves the records matching that rewritten question and uses them, together with the contextualized question, to answer it. As we can observe from the output, the LLM was able to decipher what it refers to in this context:

second_question = "Can you explain the reasoning behind calling it large?"
second_answer = rag_chain.invoke(
    {"question": second_question, "chat_history": chat_history})
print(second_answer.content)
This results in the following output:
The term "large" in large language model refers to both the size of the model itself and the volume of data it is trained on. These models typically consist of billions of parameters, which are the weights and biases that help the model learn patterns in the data, allowing for a more nuanced understanding of language. Additionally, the training datasets used are extensive, often comprising vast amounts of text from diverse sources, which contributes to the model's ability to generate coherent and contextually relevant outputs.
Note:
We provided a basic workflow for how to execute RAG-based flows. We recommend referring to the LangChain documentation and using the necessary components to run solutions in production. Some of these steps would include evaluating other vector DB stores and using concrete types, such as BaseChatMessageHistory and RunnableWithMessageHistory, to better manage chat histories, as well as using LangServe to expose endpoints to serve requests.
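As a rough, non-production sketch of that suggestion, chat histories can be managed per session with RunnableWithMessageHistory; the in-memory store and the session_id value below are illustrative assumptions, so consult the LangChain documentation for the current API:

from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

session_store = {}

def get_session_history(session_id: str):
    # One in-memory history per session id; swap in a persistent
    # BaseChatMessageHistory implementation for production use.
    if session_id not in session_store:
        session_store[session_id] = ChatMessageHistory()
    return session_store[session_id]

chat_chain = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="question",
    history_messages_key="chat_history",
)

response = chat_chain.invoke(
    {"question": "What is a large language model?"},
    config={"configurable": {"session_id": "demo-session"}},
)
print(response.content)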