Architecture
To build our RAG-based chatbot system, we’ll use a serverless, event-driven architecture on Google Cloud. This approach follows the cloud-native principles we have used in previous examples and integrates readily with other cloud services. For a more detailed Google Cloud example, see this sample architecture: https://cloud.google.com/architecture/rag-capable-gen-ai-app-using-vertex-ai.
For the purpose of this example, the architecture consists of the following key components:
- Ingestion layer: This layer is responsible for accepting incoming user queries from various channels, such as web forms, chat interfaces, or API endpoints. We’ll use Google Cloud Functions as the entry point for our system; a function can be invoked directly over HTTP or triggered by events from services like Cloud Storage or Pub/Sub.
- Document corpus management: In this layer, we’ll store embeddings representing the content of the documents. In this...