Data flow
Data flow involves the movement of data through a system, affecting the accuracy, relevance, and speed of the results delivered to consumers, which, in turn, influences their engagement. This section explores design considerations for handling data sources, processing data, prompting LLMs, and embedding models to enrich data using MDN as an example. Figure 6.5 illustrates this flow.
Figure 6.5: Typical data flow in an AI/ML application
Let's begin with the design for handling data sources. Data can be ingested into MongoDB Atlas either statically (at rest), from files as is, or dynamically (in motion), which allows for continuous updates, data transformation, and logic execution.
Handling static data sources
The simplest way to import static data is to use mongoimport, which supports JSON, CSV, and TSV formats. It is ideal for initial loads or bulk updates as it can handle large datasets. Moreover, increasing the number of insertion...
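As a minimal sketch of a static load, the following Python helper assembles a mongoimport command line without running it. The flags used (`--uri`, `--collection`, `--type`, `--file`, `--headerline`, `--numInsertionWorkers`) are standard mongoimport options. The helper name `build_mongoimport_args`, the `mdn` database, the `articles` collection, and the file names are hypothetical, chosen only for illustration.

```python
import shlex
from pathlib import Path

# File extensions mapped to the values accepted by mongoimport's --type flag.
SUPPORTED_TYPES = {".json": "json", ".csv": "csv", ".tsv": "tsv"}


def build_mongoimport_args(uri, collection, file_path, workers=1):
    """Return the argument list for a mongoimport run (nothing is executed).

    Hypothetical helper: the name and parameters are illustrative only.
    """
    suffix = Path(file_path).suffix.lower()
    if suffix not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported file type: {suffix!r}")

    args = [
        "mongoimport",
        "--uri", uri,                          # connection string, including the database name
        "--collection", collection,
        "--type", SUPPORTED_TYPES[suffix],
        "--file", str(file_path),
        "--numInsertionWorkers", str(workers),  # number of concurrent insert workers
    ]
    if suffix in (".csv", ".tsv"):
        # Take the field names from the first row of the file.
        args.append("--headerline")
    return args


if __name__ == "__main__":
    # Placeholder connection string and names; replace with real values.
    command = build_mongoimport_args(
        "mongodb+srv://<user>:<password>@<cluster-host>/mdn",
        "articles",
        "articles.json",
        workers=4,
    )
    print(shlex.join(command))
```

To perform the import, the returned list can be passed to `subprocess.run(command, check=True)` on a machine where the MongoDB Database Tools are installed. Building the list separately keeps the command easy to inspect and log before any data is written.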