Document ingestion with Amazon Bedrock
The architectural pattern for QA systems with context can be broadly divided into two categories: QA on small documents and QA on large documents in knowledge bases. While the core components remain similar, the approach and techniques vary with the size and complexity of the input data.
QA on small documents
For QA systems designed to handle small documents, such as paragraphs or short articles, the architectural pattern typically follows a pipeline approach consisting of the following stages:
- Query processing: The natural language query is converted to a vector representation (an embedding) so that it can be compared against the indexed documents.
- Document retrieval: Relevant documents or passages are retrieved from the corpus based on the query keywords or semantic similarity measures. For smaller documents, retrieval can be straightforward; you can directly embed and index the entire document or passage within your vector store. In another scenario...
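The two stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_embed` is a hypothetical stand-in (a hashed bag-of-words vector) for a real embedding model such as one invoked through Amazon Bedrock, and `InMemoryVectorStore` is a hypothetical, simplified substitute for an actual vector store. Neither name comes from the original text. Each small document is embedded whole, and the query is embedded the same way and matched by cosine similarity.

```python
import hashlib
import math
import re

DIM = 256  # size of the toy embedding vector (arbitrary assumption)


def toy_embed(text):
    """Hypothetical stand-in for an embedding model.

    Builds a hashed bag-of-words vector and L2-normalizes it. A real system
    would call an embedding model here instead.
    """
    vec = [0.0] * DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # Map each token to a fixed bucket using a stable hash.
        idx = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity; inputs are already L2-normalized, so a dot product suffices."""
    return sum(x * y for x, y in zip(a, b))


class InMemoryVectorStore:
    """Hypothetical minimal vector store: keeps (text, embedding) pairs in a list."""

    def __init__(self):
        self.items = []

    def add(self, text):
        # Small documents are embedded and indexed whole, without chunking.
        self.items.append((text, toy_embed(text)))

    def search(self, query, k=1):
        # Query processing: embed the query with the same function as the documents.
        q = toy_embed(query)
        # Document retrieval: rank stored documents by similarity to the query.
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    store = InMemoryVectorStore()
    store.add("Amazon Bedrock provides access to foundation models.")
    store.add("The Eiffel Tower is located in Paris.")
    print(store.search("Where is the Eiffel Tower located?", k=1))
```

The toy embedding only captures word overlap, so it behaves more like keyword matching than true semantic similarity. Swapping `toy_embed` for a call to a real embedding model is what would give the semantic behavior described above, while the indexing and ranking logic would stay the same.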