The architecture of fine-tuning static RAG data
In this section, we question the use of non-parametric RAG data once its volume exceeds a manageable threshold. The RAG versus fine-tuning section in Chapter 1, Why Retrieval Augmented Generation?, introduced the principle of this threshold. Figure 9.1 adapts the principle to this section:
Figure 9.1: Fine-tuning threshold reached for RAG data
Notice that the processing (D2) and storage (D3) thresholds have been reached for static data, as opposed to dynamic data, in the RAG data environment. The threshold depends on each project and on parameters such as:
- The volume of RAG data to process: Embedding data requires human and machine resources. Even if we don’t embed the data, piling up static data (data that is stable over a long period of time) makes no sense.
- The volume of RAG data to store and retrieve: At some point, if we keep stacking data up, much of it may overlap.
- The retrievals require...
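To make the threshold principle concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a method prescribed by this chapter: the `Document` class, the `static_after_days` cutoff, and the `max_static_bytes` limit are all hypothetical names and values that each project would have to define for itself. The sketch separates static data (unchanged for a long period) from dynamic data and checks whether the static volume exceeds the project's limit, which is the point at which fine-tuning becomes worth considering.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical record describing one item in the RAG data store."""
    doc_id: str
    size_bytes: int
    days_since_update: int  # how long the content has remained unchanged


def split_static_dynamic(docs, static_after_days=365):
    """Separate long-stable (static) documents from recently changed (dynamic) ones.

    The 365-day default is an assumption made for illustration only.
    """
    static = [d for d in docs if d.days_since_update >= static_after_days]
    dynamic = [d for d in docs if d.days_since_update < static_after_days]
    return static, dynamic


def fine_tuning_threshold_reached(docs, max_static_bytes, static_after_days=365):
    """Return True when the volume of static data exceeds the project's limit."""
    static, _ = split_static_dynamic(docs, static_after_days)
    static_volume = sum(d.size_bytes for d in static)
    return static_volume > max_static_bytes


if __name__ == "__main__":
    corpus = [
        Document("policy_manual", 600, 800),  # static
        Document("product_specs", 500, 400),  # static
        Document("daily_news", 300, 10),      # dynamic
    ]
    # Static volume is 600 + 500 = 1100 bytes, above the hypothetical 1000-byte limit.
    print(fine_tuning_threshold_reached(corpus, max_static_bytes=1000))
```

In practice, the limit would reflect the project's own processing and storage costs rather than a fixed byte count, and the static data identified this way would be the candidate for fine-tuning while the dynamic data stays in the retrieval pipeline.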