A RAG-driven generative AI pipeline
Let’s dive into what a real-life RAG pipeline looks like. Imagine we’re a team that has to deliver a whole system in just a few weeks. Right off the bat, we’re bombarded with questions like:
- Who’s going to gather and clean up all the data?
- Who’s going to handle setting up OpenAI’s embedding model?
- Who’s writing the code to get those embeddings up and running and managing the vector store?
- Who’s going to take care of integrating GPT-4 and managing what it spits out?
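To make those questions concrete, here is a minimal sketch of the stages they map onto: cleaning data, embedding it, storing and searching the vectors, and preparing the input for generation. Everything in it is an assumption made for illustration. The names `clean`, `embed`, `VectorStore`, and `build_prompt` are hypothetical, the bag-of-words `embed` is only a toy stand-in for OpenAI's embedding model, and the last stage stops at building the augmented prompt instead of calling GPT-4.

```python
import math
import re
from collections import Counter


def clean(text: str) -> str:
    """Stage 1 (data): collapse whitespace; a stand-in for real data cleanup."""
    return re.sub(r"\s+", " ", text).strip()


def embed(text: str) -> Counter:
    """Stage 2 (embedding): toy bag-of-words vector, NOT a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """Stage 3 (vector store): hypothetical in-memory store with brute-force search."""

    def __init__(self) -> None:
        self.items: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self.items.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(query: str, store: VectorStore, k: int = 1) -> str:
    """Stage 4 (generation input): augment the query with retrieved text.

    A real pipeline would send this prompt to the generative model.
    """
    context = "\n".join(store.search(query, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


store = VectorStore()
for raw in ["  Mars has  two moons,\nPhobos and Deimos. ", "The Moon orbits   Earth."]:
    store.add(clean(raw))

prompt = build_prompt("How many moons does Mars have?", store)
print(prompt)
```

Each function hides a separate job behind a small interface, which is why the work can be divided among people: the toy `embed` or the in-memory store can be swapped for a real service without touching the other stages.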
Within a few minutes, everyone starts looking pretty worried. The whole thing feels overwhelming. Seriously, who would even think about tackling all that alone?
So here’s what we do. We split into three groups, each of us taking on different parts of the pipeline, as shown in Figure 2.3:
Figure 2.3: RAG pipeline components
Each of the three groups has one...