Natural Language Processing with AWS AI Services

Derive strategic insights from unstructured data with Amazon Textract and Amazon Comprehend

Product type: Paperback
Published: Nov 2021
Publisher: Packt
ISBN-13: 9781801812535
Length: 508 pages
Edition: 1st Edition
Authors: Mona M, Premkumar Rangarajan
Table of Contents

Preface
Section 1: Introduction to AWS AI NLP Services
Chapter 1: NLP in the Business Context and Introduction to AWS AI Services
Chapter 2: Introducing Amazon Textract
Chapter 3: Introducing Amazon Comprehend
Section 2: Using NLP to Accelerate Business Outcomes
Chapter 4: Automating Document Processing Workflows
Chapter 5: Creating NLP Search
Chapter 6: Using NLP to Improve Customer Service Efficiency
Chapter 7: Understanding the Voice of Your Customer Analytics
Chapter 8: Leveraging NLP to Monetize Your Media Content
Chapter 9: Extracting Metadata from Financial Documents
Chapter 10: Reducing Localization Costs with Machine Translation
Chapter 11: Using Chatbots for Querying Documents
Chapter 12: AI and NLP in Healthcare
Section 3: Improving NLP Models in Production
Chapter 13: Improving the Accuracy of Document Processing Workflows
Chapter 14: Auditing Named Entity Recognition Workflows
Chapter 15: Classifying Documents and Setting up Human in the Loop for Active Learning
Chapter 16: Improving the Accuracy of PDF Batch Processing
Chapter 17: Visualizing Insights from Handwritten Content
Chapter 18: Building Secure, Reliable, and Efficient NLP Solutions
Other Books You May Enjoy

Preface

Authors are a quirky lot; almost like the weather in London. The sky is overcast, you want to go for a walk in Trafalgar Square, you wear your raincoat, pick up your umbrella just in case, and you think you are ready for anything. But you are woefully unaware of the sinister plan nature has for you. You walk a mile or so, and suddenly, without warning, the sky clears, the sun pours its brightest song upon your face, and lo and behold, you are caught unaware (like a deer in headlights) with your raincoat and umbrella and you are too far from home to go back and get rid of them. This is exactly what happens to the best of us when we set out to write a book. You set out with a clear objective, focus your thoughts, write a fantastic outline, get it approved, and start formulating your chapters, but unbeknown to you, the book has other plans on how it wants to write itself.

When this happens, as in life, there are always choices. You can let the creative stream express itself through your hands onto the pages of the book, or you can resist and follow the preconceived pattern you laid out. There is, of course, also a third choice, which is to follow the overall structure for what you want to convey, but allow creativity to take control when it wants to. This is what we did for this book. But it was not as easy as we thought at first, because creativity doesn't take no for an answer. The famous Sufi poet Jalaluddin Rumi said: "In silence, there is eloquence. Stop weaving and see how the pattern improves." The most difficult part was to stop "weaving" or to stop being inspired by the content that we had already published as AWS authors. This was also a hard requirement for the book, and so it was a strong motivation for us to be creative and come up with original, in-demand, and fresh content for the book.

So, we stopped "weaving." The next logical step was for the pattern to improve. But nothing happened. The deadline for the first chapter was looming, and our editors were very politely reminding us of the due date. Still nada. We used this "no weaving" time to storyboard and architect the technical chapters, but the glue that was to hold the book together, the main narrative, continued to elude us. And then suddenly, one day, without warning, it struck. We had totally missed the important first part of Rumi's saying: "In silence, there is eloquence." A walk on a nearby nature trail took care of the daily quota of silence, during which a faint thought appeared: a memory of Ali Baba and the Forty Thieves, a story that my father (Shri T. Rangarajan) had narrated to me when I was a kid. It dawned on me that the famous sequence from the story was in fact my first recollection of using voice to perform a task (please refer to Chapter 1, NLP in the Business Context and Introduction to AWS AI Services, in the book). And from then on, the floodgates opened. They never closed until the book was written in its entirety. And that is how this book came about.

An interesting fact about life we all know is that change is the only constant. And this was true when writing this book as well. One of the best things about AWS is the pace at which new features are introduced. The AWS product roadmap is based on direct customer feedback, and features are improved iteratively, with new ones launched continuously. So, as we were writing this book, Amazon Comprehend and Amazon Textract added new features, changed their console experiences, and so on. For example, Amazon Comprehend modified its console experience, added support for custom entity recognition training directly from PDF documents, and improved its custom entity recognition model framework to support training with just 100 annotations per entity and 250 documents. Amazon Textract reduced pricing by 32% for the AnalyzeDocument and DetectDocumentText APIs in eight global AWS Regions, announced support for the automated processing of invoices, and so on. A full list of what's new in AWS in 2021 can be reviewed at this link: https://aws.amazon.com/about-aws/whats-new/2021/.

You will notice these changes as you build the solutions for the various NLP use cases in this book. Please note that since the Amazon Textract and Amazon Comprehend consoles have changed, the instructions in the book may not be a word-for-word match with your experience in the AWS Management Console; however, they are accurate and adequate for your needs.

For example, the Train Recognizer button in the Amazon Comprehend console for custom entity recognition has now changed to Create new model. Similarly, Train Classifier in the Amazon Comprehend console for custom classification has now also changed to Create new model. When you specify Training and test dataset for custom entity recognition, a new option will now appear in the console for selecting PDF and Word documents. The Amazon Textract console has also changed, and it now offers AnalyzeExpense as an option for viewing the results for your document.
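For readers who prefer the API to the console, the AnalyzeExpense results mentioned above can also be read programmatically. The following is a minimal sketch, not code from the book: the helper names `summary_fields` and `analyze_expense_in_s3` are hypothetical, the sample response is hand-written to mimic the response shape rather than real Textract output, and the live call assumes Boto3 is installed and AWS credentials are configured.

```python
from typing import Any, Dict


def summary_fields(response: Dict[str, Any]) -> Dict[str, str]:
    """Collect summary fields (type -> detected value) from an AnalyzeExpense response."""
    fields: Dict[str, str] = {}
    for document in response.get("ExpenseDocuments", []):
        for field in document.get("SummaryFields", []):
            field_type = field.get("Type", {}).get("Text")
            value = field.get("ValueDetection", {}).get("Text")
            if field_type and value is not None:
                # Keep the first value seen for each field type.
                fields.setdefault(field_type, value)
    return fields


def analyze_expense_in_s3(bucket: str, key: str) -> Dict[str, str]:
    """Hypothetical helper: run AnalyzeExpense on an S3 object (needs boto3 and credentials)."""
    import boto3  # imported lazily so summary_fields works without boto3 installed

    textract = boto3.client("textract")
    response = textract.analyze_expense(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return summary_fields(response)


# Hand-written sample in the shape of an AnalyzeExpense response (illustrative only).
sample_expense_response = {
    "ExpenseDocuments": [
        {
            "SummaryFields": [
                {"Type": {"Text": "VENDOR_NAME"}, "ValueDetection": {"Text": "Example Store"}},
                {"Type": {"Text": "TOTAL"}, "ValueDetection": {"Text": "42.50"}},
            ]
        }
    ]
}

print(summary_fields(sample_expense_response))
```

Keeping the parsing separate from the network call means the parsing logic can be tried on a saved response without an AWS account.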

In the majority of the book, however, we have used APIs to build the solutions, and the best thing about AWS is that the APIs do not change. You get consistent requests and responses. You just need to upgrade your version of Boto3, the AWS SDK for Python, if you want to use the latest features. Moreover, our goal is to make sure this book remains relevant and up to date.
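As an illustration of what building with the APIs looks like, here is a minimal sketch of calling the Amazon Textract DetectDocumentText API through Boto3 and pulling out the detected lines. It is an assumption-laden example rather than code from the book: the helper names `extract_lines` and `detect_lines_in_s3_document` are hypothetical, the sample response is hand-written to mimic the response shape, and the live call assumes Boto3 is installed and AWS credentials are configured.

```python
from typing import Any, Dict, List


def extract_lines(response: Dict[str, Any]) -> List[str]:
    """Return the text of every LINE block in a DetectDocumentText response."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE" and "Text" in block
    ]


def detect_lines_in_s3_document(bucket: str, key: str) -> List[str]:
    """Hypothetical helper: run DetectDocumentText on an S3 object (needs boto3 and credentials)."""
    import boto3  # imported lazily so extract_lines works without boto3 installed

    textract = boto3.client("textract")
    response = textract.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return extract_lines(response)


# Hand-written sample in the shape of a DetectDocumentText response (illustrative only).
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Invoice 001"},
        {"BlockType": "WORD", "Text": "Invoice"},
        {"BlockType": "LINE", "Text": "Total due"},
    ]
}

print(extract_lines(sample_response))
```

To pick up newly released API features, the installed SDK can be upgraded with `pip install --upgrade boto3`.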
