End-to-end evaluation
Beyond the metrics for evaluating each stage of the RAG pipeline in isolation, ragas provides metrics for the RAG system as a whole, called end-to-end evaluation. For end-to-end evaluation, ragas has two metrics, called answer correctness and answer similarity, as you see here in the last part of the output and charts:
**End-to-end evaluation**:

| Metric | Similarity Run | Hybrid Run | Difference |
| --- | --- | --- | --- |
| answer_correctness | 0.776018 | 0.717365 | 0.058653 |
| answer_similarity | 0.969899 | 0.969170 | 0.000729 |
The chart in Figure 9.4 visualizes these results:

Figure 9.4 – Chart showing end-to-end performance comparison...
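To make the comparison concrete, the following minimal sketch recomputes the Difference column from the scores reported above and adds a relative gap. The `scores` dictionary and the `compare_runs` helper are hypothetical names introduced here for illustration; they are not part of ragas, and the relative percentage is an extra view that does not appear in the original output.

```python
# Scores copied from the end-to-end evaluation output above.
# The dictionary layout is an assumption made for this sketch.
scores = {
    "answer_correctness": {"similarity": 0.776018, "hybrid": 0.717365},
    "answer_similarity": {"similarity": 0.969899, "hybrid": 0.969170},
}


def compare_runs(scores):
    """Return the absolute and relative gap (similarity minus hybrid) per metric."""
    rows = {}
    for metric, runs in scores.items():
        diff = runs["similarity"] - runs["hybrid"]
        rows[metric] = {
            # Absolute gap, matching the Difference column.
            "difference": round(diff, 6),
            # Gap expressed as a percentage of the hybrid run's score.
            "relative_pct": round(100 * diff / runs["hybrid"], 2),
        }
    return rows


comparison = compare_runs(scores)
for metric, row in comparison.items():
    print(f"{metric}: diff={row['difference']:.6f} "
          f"({row['relative_pct']:.2f}% relative to hybrid)")
```

Seen this way, the gap in answer correctness is roughly 8% of the hybrid score, while the gap in answer similarity is under 0.1%, so the two runs differ mainly in correctness rather than in semantic similarity to the ground truth.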