Evaluate using RAGAs
This book is about design, so product people are not expected to implement RAGAs themselves. RAGAs is a framework for evaluating a RAG pipeline. Any approach that takes test data, is actually used, and measures quality reliably is fine with me; RAGAs is popular in the AI community, so it is worth covering. Call on product experts to review the results and validate the findings. The goal is to understand the metrics and use them to make decisions that improve the model.
The RAGAs process
All good stories start at the beginning: an LLM product needs to be evaluated. Don’t wait for customers to complain; complaints come too late, and customers leave quickly when they are frustrated with quality. This is similar to phone support: when a customer has a horrible interaction, they tend to tell 20 people how bad it was, and this loss of goodwill hurts the company’s reputation. If backend systems or recommenders miss their mark, it will leave a foul taste in customers...