Amazon Comprehend is an AWS service that provides natural language processing (NLP) algorithms. NLP is a field of machine learning that analyzes human (that is, natural) languages and identifies various attributes of text written in them. In many of the previous chapters, we looked at examples of structured data, where the data had predefined features and was organized as rows of observations. A natural language dataset, however, is more complicated to process. Such datasets are called unstructured datasets because the structure of their features is not well defined.
Algorithms are therefore needed to extract structure and information from a text document. For example, the words of a natural language are arranged according to a grammatical structure. Natural-language sentences also contain keywords that carry information about places, people, and...
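As a rough illustration of what extracting this kind of information can look like, the following sketch calls the `detect_entities` operation of Amazon Comprehend through boto3 and reduces the response to pairs of entity type and entity text. The helper name `summarize_entities`, the sample sentence, and the region are assumptions made for this example; `demo()` additionally assumes that boto3 is installed and that AWS credentials with permission to call Comprehend are configured.

```python
from typing import Any, List, Tuple


def summarize_entities(
    client: Any, text: str, language_code: str = "en"
) -> List[Tuple[str, str]]:
    """Return (entity type, entity text) pairs that Comprehend detects in text.

    `client` is expected to be a boto3 Comprehend client. This helper is a
    hypothetical convenience wrapper for illustration, not part of boto3.
    """
    response = client.detect_entities(Text=text, LanguageCode=language_code)
    return [(entity["Type"], entity["Text"]) for entity in response["Entities"]]


def demo() -> None:
    """Call the real service; needs boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the helper above has no hard dependency

    # The region is an assumption; use one where Comprehend is available to you.
    comprehend = boto3.client("comprehend", region_name="us-east-1")
    sample = "Jane Doe moved from Seattle to Boston in 2019."  # made-up sentence
    for entity_type, entity_text in summarize_entities(comprehend, sample):
        print(f"{entity_type}: {entity_text}")
```

Each entity in the response also carries a confidence `Score` and character offsets (`BeginOffset`, `EndOffset`), which this sketch discards for brevity.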