Bringing it all together
Having examined the technical components of the generative AI stack separately, let’s now consolidate them into a unified view.
Figure 16.8: Generative AI tech stack
In summary, a generative AI platform extends an ML platform with additional capabilities such as prompt management, input/output filtering, and tools for FM evaluation and RLHF workflows. To accommodate these enhancements, the ML platform’s pipeline capability will need to support new generative AI workflows. The new RAG infrastructure will serve as the backbone of RAG-based LLM applications and will be closely integrated with the underlying generative AI platform.
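To make the relationship between these capabilities concrete, the following is a minimal sketch of how a single request might pass through input filtering, retrieval, prompt management, a model call, and output filtering. Every name in it (`PROMPT_TEMPLATES`, `BLOCKED_TERMS`, `DOCUMENTS`, `retrieve`, `stub_model`, `handle_request`) is hypothetical and chosen for illustration. The word-overlap retrieval stands in for a real vector search, and the stub model stands in for a real FM endpoint.

```python
import re
from typing import Callable

# Hypothetical prompt store standing in for a prompt management component.
PROMPT_TEMPLATES = {
    "qa_v1": "Answer using only the context.\nContext: {context}\nQuestion: {question}",
}

# Illustrative block list; a real input/output filter would apply a richer policy.
BLOCKED_TERMS = {"password", "ssn"}

# Toy document store standing in for the RAG infrastructure.
DOCUMENTS = [
    "Prompt management stores and versions prompt templates.",
    "RLHF workflows collect human feedback to tune a foundation model.",
    "Vector databases support retrieval for RAG applications.",
]


def tokens(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def passes_filter(text: str) -> bool:
    """Return False if the text contains any blocked term."""
    return not (tokens(text) & BLOCKED_TERMS)


def retrieve(question: str, top_k: int = 1) -> list[str]:
    """Rank documents by word overlap with the question (stand-in for vector search)."""
    q = tokens(question)
    ranked = sorted(DOCUMENTS, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:top_k]


def stub_model(prompt: str) -> str:
    """Placeholder for a foundation model call: returns the prompt's context line."""
    for line in prompt.splitlines():
        if line.startswith("Context: "):
            return line[len("Context: "):]
    return ""


def handle_request(question: str, model: Callable[[str], str] = stub_model) -> str:
    """Run one request through filter -> retrieve -> prompt -> model -> filter."""
    if not passes_filter(question):
        return "Request blocked by input filter."
    context = " ".join(retrieve(question))
    prompt = PROMPT_TEMPLATES["qa_v1"].format(context=context, question=question)
    answer = model(prompt)
    if not passes_filter(answer):
        return "Response withheld by output filter."
    return answer


if __name__ == "__main__":
    print(handle_request("How does RLHF use human feedback?"))
    print(handle_request("What is my password?"))
```

Passing the model in as a callable keeps the orchestration independent of any particular FM provider, which mirrors the idea that the platform layers (filtering, prompt management, retrieval) wrap around the model rather than live inside it.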
The development of generative AI applications will continue to leverage other core application architecture components, including streaming, batch processing, message queuing, and workflow tools.
Although many of the core components will likely have their own security and governance capabilities, there will still be an overarching need for end-to-end observability, monitoring, security, and governance to develop and operate generative AI applications at scale.