Poisoning embeddings in RAG
In Chapter 12, we looked at stored prompt injections, where an attacker stores carefully crafted payloads in data used by RAG to perform an indirect prompt injection. Stored injections and data poisoning are similar in many ways; the usual distinction is that stored injections act at inference time, whereas poisoning targets model training.
Poisoning the embeddings that RAG uses is a more sophisticated version of a stored prompt injection. Before we explain the attack, let’s briefly review embeddings and how they work in the context of RAG.
Embeddings are fundamental in machine learning (ML) and NLP. They allow complex, high-dimensional data (such as text) to be represented in a lower-dimensional, dense vector space, typically as vectors of floating-point numbers. These vector representations capture semantic relationships between data points, such as the similarity or contextual relatedness of words, sentences, or documents. The transformation into embeddings allows models to process and compare text efficiently.
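To make this concrete, here is a minimal sketch of how a RAG pipeline might embed documents and retrieve the closest match to a query by cosine similarity. It is an illustration, not any particular product’s implementation: it assumes the sentence-transformers library and its all-MiniLM-L6-v2 model, and the document strings are hypothetical.

```python
# A minimal sketch of embedding-based retrieval, assuming the
# sentence-transformers library and the all-MiniLM-L6-v2 model.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical knowledge-base documents a RAG pipeline might index.
documents = [
    "Employees may reset their password via the self-service portal.",
    "The quarterly report is published on the finance wiki.",
    "Office hours are 9am to 5pm, Monday through Friday.",
]

# Encode the documents into dense vectors (the embeddings).
doc_vectors = model.encode(documents, normalize_embeddings=True)

# Encode a user query into the same vector space.
query = "How do I change my password?"
query_vector = model.encode(query, normalize_embeddings=True)

# With normalized vectors, cosine similarity reduces to a dot product.
scores = doc_vectors @ query_vector
best = int(np.argmax(scores))
print(f"Retrieved: {documents[best]} (score={scores[best]:.3f})")
```

Because retrieval is driven purely by vector proximity, whatever document sits closest to a likely query becomes the model’s context, and that is precisely the property an embedding-poisoning attack abuses.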