Summary
This chapter began with a gentle introduction to RAG, explaining why and when you should use it. We also saw how embeddings and vector databases work, as they are the cornerstone of any RAG system.
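To make that recap concrete, here is a minimal sketch of the embed-and-retrieve flow using a sentence-transformers model and a Qdrant collection. The model name, collection name, and sample texts are illustrative assumptions, not the chapter's exact code.

```python
from sentence_transformers import SentenceTransformer
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
client = QdrantClient(":memory:")  # in-memory Qdrant instance for the sketch

# Create a collection sized to the model's embedding dimension.
client.create_collection(
    collection_name="documents",
    vectors_config=VectorParams(
        size=model.get_sentence_embedding_dimension(), distance=Distance.COSINE
    ),
)

# Embed a chunk of text and store it alongside its payload (metadata).
chunk = "RAG grounds an LLM's answers in retrieved context."
client.upsert(
    collection_name="documents",
    points=[PointStruct(id=1, vector=model.encode(chunk).tolist(), payload={"text": chunk})],
)

# At query time, embed the question and retrieve the nearest chunks.
hits = client.search(
    collection_name="documents",
    query_vector=model.encode("What does RAG do?").tolist(),
    limit=3,
)
print([hit.payload["text"] for hit in hits])
```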
Then, we looked into advanced RAG and why we need it in the first place. We built a strong intuition about which parts of a RAG system can be optimized and presented some popular advanced RAG techniques for working with textual data.
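As an illustration of one such post-retrieval technique, the sketch below reranks retrieved chunks against the query with a cross-encoder; the model name, query, and chunks are hypothetical, and reranking is only one of the optimizations covered.

```python
from sentence_transformers import CrossEncoder

# Assumed reranker checkpoint; any cross-encoder trained for passage ranking works.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "How does the feature pipeline store embeddings?"
retrieved_chunks = [
    "The feature pipeline loads embedded chunks into a vector database.",
    "The training pipeline fine-tunes the LLM on the instruct dataset.",
    "Embeddings are indexed so that semantic search can retrieve them.",
]

# Score every (query, chunk) pair, then keep only the highest-scoring chunks.
scores = reranker.predict([(query, chunk) for chunk in retrieved_chunks])
reranked = [chunk for _, chunk in sorted(zip(scores, retrieved_chunks), reverse=True)]
print(reranked[:2])
```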
Next, we applied everything we learned about RAG to design the architecture of the LLM Twin's RAG feature pipeline. We also covered the difference between a batch and a streaming pipeline and gave a short introduction to the CDC pattern, which helps keep two databases in sync.
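As a rough illustration of the CDC idea (not the chapter's implementation), the sketch below watches a MongoDB change stream and forwards each event so a downstream consumer can update the second store. The connection string, database, and collection names are assumptions, and an in-process queue stands in for a real broker such as RabbitMQ.

```python
import json
import queue

from pymongo import MongoClient

events = queue.Queue()  # stand-in for a real message broker (e.g., RabbitMQ)

# Assumed connection string, database, and collection names.
client = MongoClient("mongodb://localhost:27017")
collection = client["llm_twin"]["documents"]

# Change streams require MongoDB to run as a replica set; every insert,
# update, or delete then shows up as an event that can be forwarded downstream.
with collection.watch() as stream:
    for change in stream:
        events.put(json.dumps(change, default=str))
```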
Ultimately, we walked step by step through the implementation of the LLM Twin's RAG feature pipeline, where we saw how to integrate ZenML as an orchestrator, how to design the application's domain entities, and how to implement an ODM module. We also understood how to apply some software engineering best practices along the way.
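For reference, here is a minimal sketch of how a feature pipeline's stages can be wired together with ZenML; the step names and their bodies are placeholders, not the chapter's actual implementation.

```python
from zenml import pipeline, step


@step
def query_data_warehouse() -> list[str]:
    # Placeholder: in the real pipeline this pulls the raw documents for a user.
    return ["raw document one", "raw document two"]


@step
def clean_documents(documents: list[str]) -> list[str]:
    # Placeholder cleaning logic (strip noise, normalize text, ...).
    return [doc.strip().lower() for doc in documents]


@step
def chunk_and_embed(documents: list[str]) -> list[str]:
    # Placeholder for chunking, embedding, and loading into the vector DB.
    return documents


@pipeline
def feature_pipeline() -> None:
    raw = query_data_warehouse()
    cleaned = clean_documents(raw)
    chunk_and_embed(cleaned)


if __name__ == "__main__":
    # Running this requires an initialized ZenML stack (e.g., `zenml init`).
    feature_pipeline()
```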