Batch and real-time integration patterns
The first key decision when evaluating batch versus real-time integration approaches is data immediacy: when exactly do you need the GenAI outputs? This boils down to whether you require responsive, just-in-time results, such as servicing on-demand user queries, or whether insights from model outputs can accumulate over time before being consumed.
Let’s illustrate this with retrieval-augmented generation (RAG), where an LLM evaluates search results to formulate human-friendly query responses. This is a real-time use case: you need AI-generated answers with minimal latency to deliver a quality user experience in applications like conversational assistants or search engines. The data has to be put into action as soon as it is produced.
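A minimal sketch of this real-time pattern might look like the following. The `retrieve` and `generate_answer` functions here are hypothetical stand-ins for a real search backend and an LLM API call; the point is that both sit synchronously in the request path:

```python
from typing import Callable

# Hypothetical stand-in for an LLM API call (not a real client library).
def generate_answer(prompt: str) -> str:
    return f"answer: {prompt}"

# Real-time RAG pattern: retrieval and generation happen synchronously
# inside the request path, so the user receives the AI-generated answer
# in the same interaction that produced the query.
def handle_user_query(query: str, retrieve: Callable[[str], list[str]]) -> str:
    context = retrieve(query)                           # search step of RAG
    prompt = f"{query} | context: {'; '.join(context)}" # assemble grounded prompt
    return generate_answer(prompt)                      # blocking model call
```

Because the model call blocks the response, its latency directly bounds the user-perceived latency, which is why retrieval and inference speed dominate the design of this pattern.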
Contrast that with something like automated content generation workflows, for example, extracting metadata from a product catalog. While you still want that content quickly, there’s more flexibility...