Standardized evaluation frameworks
Key technical components of your RAG system include the embedding model that converts your documents and queries into vectors, the vector store, the vector search, and the LLM. For each of these components, standardized metrics are available that let you compare candidate options against one another. Here are some common metrics for each category.
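Two of the most common retrieval metrics you will see in these comparisons are recall@k (what fraction of the relevant documents appear in the top-k results) and mean reciprocal rank (how early the first relevant document appears). A minimal sketch, with illustrative document IDs:

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant docs that appear in the top-k retrieved results."""
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)

def reciprocal_rank(retrieved, relevant):
    """1/rank of the first relevant doc in the ranking (0 if none found)."""
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

# Hypothetical ranked output of a vector search and its ground truth
retrieved = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}

print(recall_at_k(retrieved, relevant, 2))   # 0.5 (only d1 is in the top 2)
print(reciprocal_rank(retrieved, relevant))  # 0.5 (first hit at rank 2)
```

Averaging these values over a set of test queries gives the kind of aggregate score that retrieval benchmarks report.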
Embedding model benchmarks
The Massive Text Embedding Benchmark (MTEB) Retrieval Leaderboard evaluates the performance of embedding models on a variety of retrieval tasks across different datasets. The MTEB leaderboard ranks models by their average performance across many embedding and retrieval-related tasks. The leaderboard is available at https://huggingface.co/spaces/mteb/leaderboard
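The ranking logic behind such a leaderboard is straightforward: each model's per-task scores are averaged, and models are sorted by that mean. A minimal sketch (the model names and scores below are made up for illustration; real scores come from the MTEB leaderboard):

```python
# Hypothetical per-task retrieval scores for two candidate embedding models
scores = {
    "model-a": {"nq": 0.52, "hotpotqa": 0.61, "fiqa": 0.40},
    "model-b": {"nq": 0.55, "hotpotqa": 0.58, "fiqa": 0.47},
}

def average_score(task_scores):
    """Mean score across all benchmark tasks for one model."""
    return sum(task_scores.values()) / len(task_scores)

# Rank models by their average score, best first
ranking = sorted(scores, key=lambda m: average_score(scores[m]), reverse=True)
print(ranking)  # ['model-b', 'model-a'] (~0.533 vs. ~0.510 average)
```

Because a single average can hide large per-task differences, it is worth checking the task-level scores for datasets that resemble your own corpus before committing to a model.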
On that page, click the Retrieval and Retrieval w/Instructions tabs for retrieval-specific embedding ratings. To evaluate each of the models on...