Your (soon-to-be) intelligent app
With LLMs, embedding models, vector databases, and model hosting, you have the key building blocks for creating intelligent applications. While the specific architecture will vary depending on your use case, a common pattern emerges:
- LLMs for reasoning and generation
- Embeddings and vector search for retrieval and memory
- Model hosting to serve these components at scale
This AI stack is integrated with traditional application components, such as backend services, APIs, frontend user interfaces, databases, and data pipelines. Additionally, intelligent applications often include components for AI-specific concerns, such as prompt management and optimization, data preparation and embedding generation, and AI safety, testing, and monitoring.
The rest of this section walks through an example architecture for a RAG-powered chatbot, showcasing how these components work together. The subsequent chapters will dive deeper into the end...