The role of vector DBs in retrieval-augmented generation (RAG)
To understand RAG and the pivotal role vector DBs play within it, we must first acknowledge the inherent constraints of LLMs, which paved the way for RAG techniques. This section sheds light on the specific LLM challenges that RAG aims to overcome and why vector DBs are central to the solution.
First, the big question – Why?
In Chapter 1, we delved into the limitations of LLMs, which include the following:
- LLMs possess a fixed knowledge base determined by their training data; as of February 2024, ChatGPT’s knowledge is limited to information up until April 2023.
- LLMs can hallucinate, confidently producing narratives or facts that aren't real.
- They lack persistent memory, relying solely on a fixed input context window. For example, GPT-4-32K can process at most 32K tokens combined across the prompt and completion (we’ll dive deeper into prompts...