Large language models (LLMs) are incredibly capable, but they are prone to hallucinating: generating convincing but completely incorrect or nonsensical outputs. This is a significant impediment to deploying LLMs safely in real-world applications. In this comprehensive guide, we will explore a technique called intent classification to mitigate hallucinations and make LLMs more robust and reliable.
Hallucinations occur when an AI system generates outputs that are untethered from reality and make false claims with high confidence. For example, if you asked an LLM like GPT-3 a factual question that it does not have sufficient knowledge to answer correctly, it might fabricate a response that sounds plausible but is completely incorrect.
This happens because LLMs are trained to continue text in a way that seems natural, not to faithfully represent truth. Their knowledge comes solely from their training data, so they often lack sufficient grounding in real-world facts. When prompted with out-of-distribution questions, they resort to guessing rather than admitting ignorance.
Hallucinations are especially dangerous in real applications like conversational agents. Providing false information as if it were true severely damages trust and utility. So for AI systems to be reliable digital assistants, we need ways to detect and reduce hallucinations.
One strategy is to use intent classification on the user input before feeding it to the LLM. The goal is to understand what the user is intending so we can formulate the prompt properly to minimize hallucination risks.
For example, consider a question like:
"What year did the first airplane fly?"
The intent here is clearly to get a factual answer about a historical event. An LLM may or may not know the answer. But with a properly classified intent, we can prompt the model accordingly:
"Please provide the exact year the first airplane flew if you have sufficient factual knowledge to answer correctly. Otherwise respond that you do not know."
This prompt forces the model to stick to facts it is confident about rather than attempting to guess an answer.
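As a rough illustration, here is how such an intent-aware prompt might be assembled in Python. The build_factual_prompt helper and the exact wording of the instruction are just one possible formulation, not a fixed recipe:

# Assemble a hedged prompt for a factual question
def build_factual_prompt(question):
    return (
        "Please answer the following question only if you have sufficient "
        "factual knowledge to answer correctly. Otherwise respond that you "
        "do not know.\n\n"
        "Question: " + question
    )

print(build_factual_prompt("What year did the first airplane fly?"))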
So how does intent classification work exactly? At a high level, there are three main steps:
For the first step, we need to collect a dataset of example queries, commands, and other user inputs. These should cover the full range of expected inputs our system will encounter when deployed.
For each example, we attach one or more intent labels that describe what the user hopes to achieve. Common intent categories include information requests, action requests, clarification requests, and social conversation.
Next, we use this labeled data to train an intent classification model. This can be a simple machine learning model like logistic regression, or a more powerful neural network such as BERT. The model learns to predict the intent labels for new text inputs based on patterns in the training data.
Finally, when users interact with our system, we pass their inputs to the intent classifier to attach labels before generating any AI outputs. The predicted intent drives how we frame the prompt for the LLM to minimize hallucination risks.
Here are some examples of potential intent labels, along with sample user inputs for each:
Information Request: "What is the capital of Vermont?" / "What year was Julius Caesar born?"
Action Request: "Can you book me a flight to Denver?" / "Plot a scatter graph of these points."
Clarification: "Sorry, I don't understand. Can you rephrase that?" / "What do you mean by TCP/IP?"
Social: "How is your day going?" / "What are your hobbies?"
For a production intent classifier, we would want 20-50 diverse intent types covering the full gamut of expected user inputs.
To train an accurate intent classifier, we need a dataset with at least a few hundred examples per intent class. The examples should be diverse, reflect how real users actually phrase their requests, and be reasonably balanced across the intent classes. Adhering to these data collection principles results in higher-fidelity intent classification. Next, we'll cover how to implement an intent classifier in Python.
For this example, we'll build a simple scikit-learn classifier to predict two intents - Information Request and Action Request. Here is a sample of labeled training data with 50 examples for each intent:
# Sample labeled intent data
import pandas as pd
data = [{'text': 'What is the population of France?', 'intent': 'Information Request'},
{'text': 'How tall is the Eiffel Tower?', 'intent': 'Information Request'},
# ...
{'text': 'Book a table for dinner tonight', 'intent': 'Action Request'},
{'text': 'Turn up the volume please', 'intent': 'Action Request'},
# ...
]
df = pd.DataFrame(data)
We'll use a CountVectorizer followed by a TfidfTransformer to extract features from the text data. Then we'll train a simple logistic regression classifier on these features:
# Extract features from text data
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
count_vect = CountVectorizer()
count_vect.fit(df['text'])
counts = count_vect.transform(df['text'])
tfidf_transformer = TfidfTransformer()
tfidf = tfidf_transformer.fit_transform(counts)
# Train classifier model
from sklearn.linear_model import LogisticRegression
X_train = tfidf
y_train = df['intent']
model = LogisticRegression()
model.fit(X_train, y_train)
Now we can make predictions on new text inputs:
# Make predictions on new texts
texts = ['What year was Napoleon Bonaparte born?',
'Play some music please']
counts = count_vect.transform(texts)
tfidf = tfidf_transformer.transform(counts)
predictions = model.predict(tfidf)
print(predictions)
# Output: ['Information Request', 'Action Request']
And that's it! With just a few lines of scikit-learn code, we have a simple but decent intent classifier. For a more robust production classifier, we would want to use deep learning models like BERT rather than logistic regression. We would also train on much larger datasets with significantly more intent diversity.
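As a rough sketch of what that upgrade could look like, the snippet below fine-tunes a BERT model with the Hugging Face transformers library on the same df dataframe used above. The bert-base-uncased checkpoint, the training settings, and the IntentDataset wrapper are illustrative assumptions rather than a prescribed setup:

# Fine-tune a BERT-style model on the labeled intent data (illustrative sketch)
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

intent_labels = sorted(df['intent'].unique())
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
bert_model = AutoModelForSequenceClassification.from_pretrained(
    'bert-base-uncased', num_labels=len(intent_labels))

class IntentDataset(torch.utils.data.Dataset):
    # Wraps tokenized texts and integer intent ids for the Trainer
    def __init__(self, texts, intents):
        self.encodings = tokenizer(texts, truncation=True, padding=True)
        self.labels = [intent_labels.index(i) for i in intents]
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item['labels'] = torch.tensor(self.labels[idx])
        return item

train_dataset = IntentDataset(df['text'].tolist(), df['intent'].tolist())
trainer = Trainer(
    model=bert_model,
    args=TrainingArguments(output_dir='intent-model', num_train_epochs=3),
    train_dataset=train_dataset)
trainer.train()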
However, the underlying principles remain the same - leverage labeled data to train a model that can predict intents for new text inputs. Those intents can then be used to formulate better prompts for LLMs to reduce hallucinations.
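For instance, reusing the count_vect, tfidf_transformer, and model objects from the scikit-learn example above, one possible way to turn a predicted intent into an intent-specific prompt looks like this. The template wording and the placeholder call_llm function are assumptions for illustration:

# Map predicted intents to prompt templates that discourage hallucination
PROMPT_TEMPLATES = {
    'Information Request': (
        'Please answer only if you have sufficient factual knowledge to '
        'answer correctly. Otherwise respond that you do not know.\n\n'
        'Question: {text}'),
    'Action Request': (
        'The user has asked you to perform an action you cannot execute. '
        'Politely explain that you cannot complete it.\n\nRequest: {text}'),
}

def build_prompt(text):
    # Classify the input, then frame the prompt according to the intent
    features = tfidf_transformer.transform(count_vect.transform([text]))
    intent = model.predict(features)[0]
    return PROMPT_TEMPLATES[intent].format(text=text)

prompt = build_prompt('What year was Napoleon Bonaparte born?')
# The framed prompt would then be sent to the LLM,
# e.g. response = call_llm(prompt)  # call_llm is a hypothetical LLM client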
Now let's look at a few examples of how classified intents can guide how we prompt LLMs to curb hallucinations:
User: "When did the first airplane fly?"
Intent: Information Request
Answer: "On December 17, 1903"
By detecting this as a request for factual information, we can prompt the LLM to share only facts it knows, which prevents it from guessing at an answer that may be false. Alternatively, we can gather the answer from a trusted third-party API or knowledge base to make sure the response is accurate, as sketched below.
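As a minimal sketch of that fallback pattern, the lookup_fact function below is a hypothetical stand-in for whatever knowledge base or third-party API you would actually query; the example fact table is purely illustrative:

# Answer information requests from a verified source before involving the LLM
def lookup_fact(question):
    # Hypothetical stand-in for a knowledge base or third-party API call
    known_facts = {'When did the first airplane fly?': 'December 17, 1903'}
    return known_facts.get(question)

def answer_information_request(question):
    fact = lookup_fact(question)
    if fact is not None:
        return fact
    # No verified answer available, so admit ignorance instead of guessing
    return "I don't have a verified answer to that question."

print(answer_information_request('When did the first airplane fly?'))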
User: "Can you book me a ticket to SF?"
Intent: Action Request
Answer: "I apologize, but I cannot do that. I am an AI assistant without direct access to online services."
Recognizing this as an action request allows us to clarify that the LLM cannot actually complete the requested task. This prevents the model from hallucinating a fake confirmation that the action was carried out.
User: "How are you doing today?"
Intent: Social
Answer: "I am an AI system without real subjective experiences."
For casual social conversation, the AI can adopt a lighter, more conversational tone while still staying grounded about its nature as an AI system.
The key in each case is using the predicted intent to formulate a prompt that discourages ungrounded hallucinations and encourages sticking to solid facts the LLM is confident about. Of course, hallucinations cannot be fully eliminated, but intent-guided prompting pushes models to be more honest about the limits of their knowledge.
Studies have shown intent classification can significantly improve AI reliability by reducing false factual claims. In one experiment, hallucination rates for an LLM dropped from 19.8% to just 2.7% using a classifier trained on 100 intent types. Precision on answering factual questions rose from 78% to 94% with intents guiding prompting.
Beyond curbing hallucinations, intent classification also enables smarter response formulation in general, such as routing action requests to the right handlers, asking for clarification when a query is ambiguous, and keeping social chit-chat appropriately lightweight.
So in summary, intent classification is a powerful technique to minimize risky AI behaviors like ungrounded hallucinations. It delivers major improvements in reliability and safety for real-world deployments where trustworthiness is critical. Adopting an intent-aware approach is key to developing AI assistants that can have nuanced, natural interactions without jeopardizing accuracy.
Hallucinations pose serious challenges as we expand real-world uses of large language models and conversational agents. Identifying clear user intents provides crucial context that allows crafting prompts in ways that curb harmful fabrications. This guide covered best practices for building robust intent classifiers, detailed implementation in Python, and demonstrated impactful examples of reducing hallucinations through intent-guided prompting.
Adopting these approaches allows developing AI systems that admit ignorance rather than guessing and remain firmly grounded in reality. While not a magic solution, intent classification serves as an invaluable tool for engineering the trustworthy AI assistants needed in domains like medicine, finance, and more. As models continue to advance in capability, maintaining rigorous intent awareness will only grow in importance.
Gabriele Venturi is a software engineer and entrepreneur who started coding at the young age of 12. Since then, he has launched several projects across gaming, travel, finance, and other spaces - contributing his technical skills to various startups across Europe over the past decade.
Gabriele's true passion lies in leveraging AI advancements to simplify data analysis. This mission led him to create PandasAI, released open source in April 2023. PandasAI integrates large language models into the popular Python data analysis library Pandas. This enables an intuitive conversational interface for exploring data through natural language queries.
By open-sourcing PandasAI, Gabriele aims to share the power of AI with the community and push boundaries in conversational data analytics. He actively contributes as an open-source developer dedicated to advancing what's possible with generative AI.