Why do we need RAG?
An LLM's knowledge of the world is limited to its training data, so ChatGPT knows nothing about recent events or your own private data, which severely restricts its ability to provide relevant answers. Performance can degrade further through hallucinations: when the LLM has no knowledge to support a question, it may simply make things up.
When we talk about an LLM's knowledge, then, there are two types:
- Knowledge from information that the LLM used during training.
- Knowledge from information that was passed to the LLM via a prompt in the context of the conversation. We can call this context-specific knowledge.
The standout use case for an LLM application, and the one I'm asked about most, is allowing an LLM to interpret and discuss data outside its training dataset. This includes accessing real-time information and other external data sources, such as proprietary information...
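To make the distinction concrete, here's a minimal sketch of supplying context-specific knowledge through a prompt, using OpenAI's chat completions API. The `ask_with_context` helper, the document snippet, and the choice of model are illustrative assumptions, not a prescribed implementation:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def ask_with_context(question: str, context: str) -> str:
    """Inject external context into the prompt so the model answers from
    context-specific knowledge rather than its training data alone."""
    prompt = (
        "Answer the question using ONLY the context below. "
        "If the context doesn't contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat-capable model works here
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


# A hypothetical proprietary snippet the model never saw during training.
doc = "Acme's Q3 2024 revenue was $12.4M, up 18% year over year."
print(ask_with_context("What was Acme's Q3 2024 revenue?", doc))
```

This "stuff the context into the prompt" pattern is precisely what RAG automates: a retrieval step finds the relevant snippets from your data, and the prompt template injects them as context-specific knowledge.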