Understanding how Amazon Textract can help
We covered AWS AI Services briefly in Chapter 1, NLP in the Business Context and Introduction to AWS AI Services, when introducing the business context for NLP. Amazon Textract is an OCR-based service in the AWS AI Services stack that comes with ready-made intelligence, so you can use it in your document processing workflows without any prior ML experience. It is interesting to note that Amazon Textract has its origins in the deep learning ML models built for Amazon.com. It comes with a pre-trained model and provides APIs to which you send your documents in PDF or image format; the response contains the extracted text, tables, and key/value pairs, each with a confidence score.
Note
Amazon Textract currently supports PNG, JPEG, and PDF formats.
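The request and response flow described above can be sketched in Python with the boto3 SDK. This is a minimal sketch rather than a definitive implementation: the helper names summarize_lines and analyze_local_file, the default region, and the min_confidence threshold are assumptions made for illustration. It also assumes that AWS credentials are already configured and that the document is small enough for the synchronous API.

```python
def summarize_lines(response, min_confidence=0.0):
    """Return (text, confidence) pairs for each LINE block in a Textract response.

    `response` is the dictionary returned by Textract; every detected item
    is a "Block" carrying a BlockType and a Confidence score (0 to 100).
    """
    lines = []
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "LINE":
            continue  # skip PAGE, WORD, TABLE, KEY_VALUE_SET, and other block types
        if block.get("Confidence", 0.0) >= min_confidence:
            lines.append((block["Text"], block["Confidence"]))
    return lines


def analyze_local_file(path, region_name="us-east-1", min_confidence=0.0):
    """Send a local document to Textract and return its detected lines.

    The path and region are hypothetical defaults; adjust them for your account.
    """
    # Imported here so that summarize_lines can be used without boto3 installed.
    import boto3

    client = boto3.client("textract", region_name=region_name)
    with open(path, "rb") as f:
        document_bytes = f.read()

    # Request table and form (key/value) analysis in addition to plain text.
    response = client.analyze_document(
        Document={"Bytes": document_bytes},
        FeatureTypes=["TABLES", "FORMS"],
    )
    return summarize_lines(response, min_confidence)
```

Calling analyze_local_file("invoice.png", min_confidence=90) would, under these assumptions, return only the lines that Textract detected with at least 90 percent confidence. Keeping the parsing in a separate helper lets you inspect a saved response without calling the service again.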
Amazon Textract provides serverless APIs, so you do not need to manage any infrastructure, and you can quickly automate document management and scale to process millions of documents. Once...