4 Ways to Treat a Hallucinating AI with Prompt Engineering

Hey there, fellow AI enthusiast! Are you tired of your LLM (Large Language Model) producing random, nonsensical outputs? Fear not: today I'm opening the box of prompt engineering pills to look for something that will help you reduce those pesky hallucinations.

First, let's break down what we're dealing with. Prompt engineering is the art of crafting input prompts for AI models in a way that guides them towards more accurate, relevant, and useful responses. Think of it as gently nudging your AI model in the right direction, so it doesn't end up lost in a sea of information. Many people feel that "engineering" was not the wisest choice of word, but that's history now, as everybody has gotten used to it. In my opinion, it's more of a mix of logical thinking, creativity, language, and problem-solving skills. It feels a lot like writing code, but using natural language instead of a structured syntax and vocabulary. Users get the freedom of using their own language and level of detail, but with great freedom comes great responsibility: an average prompt will probably result in an average answer. The issue I'm addressing in this article is just one of the many pitfalls that can be avoided with some basic prompt hygiene when interacting with AI.

Now, onto the bizarre world of hallucinations. In the AI realm, hallucinations are instances when an AI model (particularly an LLM) generates output that is unrelated to the prompt, factually unsupported, implausible, or just plain weird. Some of you may have been there already: asking an AI model like GPT-3 to write a paragraph about cats, only to get a response about aliens invading Earth! While the issue has been greatly mitigated in GPT-4 and similar newer models, it's still something to be concerned about, especially if you're looking for precise, fact-based responses. To make matters worse, a hallucinated answer can sound very convincing and seem plausible in the given context.

For example, when asked for the name of the Voodoo Lady in the Monkey Island series of games, ChatGPT provides a series of convincing answers, all of which are wrong:

[Screenshot: ChatGPT's answers to the Voodoo Lady question]

It’s a bit of a trick question, as she is simply known as the Voodoo Lady in the original series of games, but you can see how convinced ChatGPT is of the answers that it provides (and continued to provide). If I hadn’t already known the answer, then I never would have known that ChatGPT was hallucinating.

What Are the Technical Reasons Why AI Models Hallucinate?

Training Data: Machine learning models are trained on vast amounts of text data from diverse sources. This data may contain inconsistencies, noise, and biases. As a result, when generating text, the model might output content that is influenced by these inconsistencies or noise, leading to hallucinations.
Probabilistic Nature: Generative models like GPTs are based on probabilistic techniques that predict the next token (e.g., a word or a fragment of a word) in a sequence, given the context. They estimate the likelihood of each candidate token and sample from these probabilities. If you've ever watched "Family Feud" on TV, you have a pretty good idea of what token prediction means. Because the model sometimes picks a less likely token, this sampling process can produce unpredictable and implausible outputs, that is, hallucinations. To make matters worse, GPTs are usually not built to say "I don't know" when they lack information. Instead, they produce the most likely-sounding answer.

Lack of Ground Truth: Unlike supervised learning tasks where there is a clear ground truth for the model to learn from, generative tasks do not have a single correct output. In addition, most LLMs that we use cannot check the facts in their output against a validated real-time source, as they do not have Internet access by default. The absence of a ground truth makes it difficult for the model to learn constraints and discern what is plausible or correct, which leads to hallucinated content.
Optimization Challenges: During training, the models are optimized using a loss function that measures the discrepancy between the tokens the model predicts and the tokens that actually appear in the training text. This loss rewards text that resembles the training data, not text that is factually accurate, and it does not always capture the nuances of human language. That makes it harder for the model to learn the correct patterns and avoid hallucinations.
Model Complexity: State-of-the-art generative models like GPT-3 have billions of parameters that make them highly expressive and capable of capturing complex patterns in the data. However, this complexity can also result in overfitting and memorization of irrelevant or spurious patterns, causing hallucinations in generated outputs.
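
To make the token sampling described under "Probabilistic Nature" more concrete, here is a minimal toy sketch in Python. It is not how any particular model is implemented: the candidate tokens and their scores are invented for illustration, and the temperature parameter is a common sampling knob that this article does not otherwise cover. It only shows why sampling can occasionally pick a less likely token.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(candidates, logits, temperature=1.0, rng=random):
    """Draw one token at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(candidates, weights=probs, k=1)[0]

# Hypothetical scores for the token that follows "The capital of France is"
candidates = ["Paris", "Lyon", "London", "banana"]
logits = [5.0, 2.0, 1.5, 0.1]

rng = random.Random(0)  # fixed seed so repeated runs behave the same
draws = [sample_next_token(candidates, logits, 1.0, rng) for _ in range(1000)]
for token in candidates:
    print(f"{token}: picked {draws.count(token)} times out of 1000")
```

Most draws land on the most likely token, but the unlikely ones still have a non-zero chance of being picked, and the model will continue the sentence just as confidently after them.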

So, clearly, we have a problem to solve. Here are four tips for improving your prompts and getting better responses from ChatGPT, each framed as a common mistake to avoid.

Four Tips for Improving Your Prompts

Not being clear and specific in your prompts
To get the best results, you must first clearly understand the problem yourself. Make sure you know what you want to achieve and keep your prompts focused on that objective. The more explicit your prompt, the better the AI model can understand what you're looking for. So instead of asking, "Tell me about the Internet," try something like, "Explain how the Internet works and its importance in modern society." By doing this, you're giving your AI model a clearer picture of what you want. Sometimes you'll have to work through multiple prompt iterations to get the result you're after, and sometimes the results will drift away from the initial topic. When that happens, bring the conversation back into focus; otherwise the hallucination effect may amplify.
Ignoring the power of an example
Everyone loves examples, they say, even AI models! Providing examples in your prompt helps your model understand the context and generate more accurate responses. For instance: "Write a brief history of Python, similar to how the history of Java is described in this article {example}." This not only gives the AI a clear topic but also a reference point to follow. Providing a well-structured example can also save you a lot of time explaining the output you expect to receive. Without an example, your prompt might be too generic, allowing too much freedom of interpretation. Think of it as a conversation: sometimes the best way to make yourself understood by the other party is to provide an example. Do you want to make sure there's no misunderstanding from the start? Include an example in your initial prompt.
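
As a rough illustration of the idea, the snippet below assembles a prompt that carries its own example. The function name, the section labels, and the sample text are all made up for this sketch; any layout that clearly separates the example from the new request should work just as well.

```python
def build_prompt_with_example(task, example_input, example_output, new_input):
    """Assemble a prompt that shows the model one worked example before the real request."""
    return (
        f"{task}\n\n"
        "Example input:\n"
        f"{example_input}\n"
        "Example output:\n"
        f"{example_output}\n\n"
        "Now do the same for this input:\n"
        f"{new_input}\n"
    )

# Illustrative values only; replace them with your own task and example.
prompt = build_prompt_with_example(
    task="Summarize the history of a programming language in one sentence.",
    example_input="Java",
    example_output="Java was released by Sun Microsystems in 1995.",
    new_input="Python",
)
print(prompt)
```

The example output doubles as a format specification: the model sees the expected length and style without you having to describe them.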
Not following “Divide et Impera”
Have you ever tried to build IKEA furniture without instructions? It's a bit like that for AI models dealing with complex prompts: too many nuts and bolts to keep track of, too many variables to consider. Instead of asking the model to "Explain the process of creating a neural network," break it down into smaller, more manageable tasks like, "Step 1: Define the problem, Step 2: Collect and prepare data," and so on. This way, the AI can tackle each step individually and generate more coherent outputs. This is also very useful when you want a more verbose and comprehensive response rather than a simple factual answer. You can, of course, combine both approaches by asking the AI to provide the steps first, and then asking for more information on each step.
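
The combined approach (ask for the steps first, then expand each one) can be sketched as a small loop. This is only an outline under stated assumptions: ask_model is a hypothetical stand-in for whatever function sends a prompt to your model and returns its text, and fake_model is a canned stub so the sketch runs without any API access.

```python
from typing import Callable, List

def outline_then_expand(topic: str, ask_model: Callable[[str], str]) -> List[str]:
    """Ask for a list of steps first, then ask about each step in its own prompt."""
    outline = ask_model(
        f"List the main steps for: {topic}. One step per line, no extra text."
    )
    steps = [line.strip() for line in outline.splitlines() if line.strip()]

    sections = []
    for number, step in enumerate(steps, start=1):
        detail = ask_model(f"Topic: {topic}\nExplain step {number} in detail: {step}")
        sections.append(f"{step}\n{detail}")
    return sections

def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; returns canned text."""
    if prompt.startswith("List the main steps"):
        return "Define the problem\nCollect and prepare data"
    return "(the model's explanation would appear here)"

for section in outline_then_expand("creating a neural network", fake_model):
    print(section)
    print()
```

Each follow-up prompt is small and focused, which also makes it easier to spot the step where an answer goes off the rails.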
Relying on the first response you receive
As most LLMs in use today do not provide enough transparency into their reasoning process, working with them sometimes feels like interacting with a magic box. The non-deterministic nature of generative AI can further amplify this problem, so when you need precision it's best to experiment with various prompt formats and compare the results. Pro tip: some open-source models can already be queried in parallel through a single web interface. Or, when interacting with a single AI model, try multiple approaches to your query, such as rephrasing the prompt, asking it as a question, or presenting it as a statement.

For example, if you're looking for information about cloud computing, you could try:

"What is cloud computing and how does it work?"
"Explain cloud computing and its benefits."
"Cloud computing has transformed the IT industry; discuss its impact and future potential."

Some chatbots, such as Google's Bard, provide multiple responses by default so you can pick the most suitable one.

[Screenshot: Bard offering multiple responses to choose from]

Compare the outputs. Validate any important facts with other independent sources. Look for implausible or weird responses. A hallucination is still possible, but the same hallucination is unlikely to appear in response to every prompt, so using different prompts makes it much easier to detect.
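
For short factual answers, the comparison can even be partly automated. The sketch below is a rough heuristic with invented function names and sample answers: it normalizes each answer, counts how often the most common one appears, and reports the agreement rate. High agreement does not prove an answer is correct, but low agreement is a good signal that you should verify it elsewhere.

```python
from collections import Counter

def normalize(answer: str) -> str:
    """Lowercase, collapse whitespace, and drop trailing punctuation."""
    return " ".join(answer.lower().split()).strip(".!?")

def compare_answers(answers):
    """Return the most common normalized answer and the share of answers that match it."""
    if not answers:
        raise ValueError("need at least one answer to compare")
    counts = Counter(normalize(a) for a in answers)
    top_answer, top_count = counts.most_common(1)[0]
    return top_answer, top_count / len(answers)

# Illustrative answers, as if collected from three differently phrased prompts.
answers = ["Paris.", "paris", "Lyon"]
top_answer, agreement = compare_answers(answers)
print(f"Most common answer: {top_answer} (agreement {agreement:.0%})")
if agreement < 1.0:
    print("The answers disagree; check an independent source.")
```

This only works when answers are short enough to compare as text; longer responses still need a human read-through.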

Returning to our earlier Voodoo Lady example, by rephrasing the question we can get the right answer from ChatGPT.

[Screenshot: ChatGPT's answer to the rephrased Voodoo Lady question]

And there you have it! By avoiding these common mistakes, you'll be well on your way to minimizing AI hallucinations and getting the output you're looking for. We all know how fast-moving and unpredictable this domain can be, so the best approach is to learn together and share best practices within the community. The best prompt engineering books have not yet been written and there's a ton of new things to learn about this emergent technology, so let's stay in touch and share our findings!

Happy prompting!

About the Author

Andrei Gheorghiu is an experienced trainer with a passion for helping learners achieve their maximum potential. He always strives to bring a high level of expertise and empathy to his teaching.

With a background in IT audit, information security, and IT service management, Andrei has delivered training to over 10,000 students across different industries and countries. He is also a Certified Information Systems Security Professional and Certified Information Systems Auditor, with a keen interest in digital domains like Security Management and Artificial Intelligence.

In his free time, Andrei enjoys trail running, photography, video editing and exploring the latest developments in technology.

You can connect with Andrei on: