Architecture
Although a deep dive into the intricacies of an LLM processing architecture is beyond the scope of this book, we will briefly discuss what a cloud-based architecture for our metadata extraction use case might look like. For this example, we will use Google Cloud, whose native AI platform, Vertex AI, allows us to seamlessly integrate leading models, including Google’s Gemini and third-party models such as Anthropic’s Claude, in an enterprise-compliant manner.
For this use case we’ll adopt a batch-optimized architecture, which is well suited to processing large volumes of data efficiently and at scale. It follows cloud-native principles and is fully serverless, built from managed Google Cloud services.
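To make the batch flow concrete, here is a minimal sketch of the pattern: reports land in an object store, a message is queued per report, and a worker drains the queue and runs metadata extraction on each one. All names here are hypothetical stand-ins; in the real architecture the object store would be Google Cloud Storage, the queue would be a Pub/Sub topic, and `extract_metadata` would call an LLM such as Gemini via Vertex AI rather than the placeholder shown.

```python
from queue import Queue

# In-memory stand-ins for the cloud services (illustration only):
# object_store ~ Google Cloud Storage, message_queue ~ Pub/Sub.
object_store = {"reports/acme_10k.pdf": b"...report bytes..."}
message_queue: Queue = Queue()

def enqueue_reports() -> None:
    """Publish one message per stored report (stand-in for Pub/Sub)."""
    for path in object_store:
        message_queue.put(path)

def extract_metadata(report_bytes: bytes) -> dict:
    """Placeholder for the LLM call (e.g. Gemini via Vertex AI)."""
    return {"size_bytes": len(report_bytes)}

def run_batch() -> list[dict]:
    """Serverless-style worker: drain the queue, process each report."""
    results = []
    while not message_queue.empty():
        path = message_queue.get()
        results.append(extract_metadata(object_store[path]))
    return results

enqueue_reports()
print(run_batch())
```

The key property this sketch illustrates is decoupling: producers only write objects and publish messages, while workers scale independently with queue depth, which is what makes the batch architecture efficient for large report volumes.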
This architecture will consist of an object store (Google Cloud Storage) to store the 10-K reports, a messaging queue...