Performing text analysis
Natural Language Processing (NLP) is used for many tasks, including text searching, language translation, sentiment analysis, speech recognition, and classification. Processing text is difficult for a number of reasons, including the inherent ambiguity of natural languages.
Several types of processing can be performed, such as:
- Identifying Stop words: These are words that are common and may not be necessary for processing
- Named Entity Recognition (NER): This is the process of identifying elements of text such as people's names, locations, or things
- Parts of Speech (POS): This identifies the grammatical parts of a sentence such as noun, verb, adjective, and so on
- Relationships: Here we are concerned with identifying how parts of text are related to each other, such as the subject and object of a sentence
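As an illustration of the first item in the list, stop-word removal can be sketched in a few lines of Python. This is a minimal sketch only: the `STOP_WORDS` set, the `tokenize` and `remove_stop_words` helpers, and the sample sentence are all hypothetical and chosen for illustration. Real stop-word lists are much longer, and production tokenizers handle punctuation and language-specific rules far more carefully.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this
# sketch); real lists typically contain a hundred or more entries.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "it"}


def tokenize(text):
    """Lowercase the text and return runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Return the tokens that are not in the stop-word set, in order."""
    return [token for token in tokens if token not in stop_words]


tokens = tokenize("The cat is in the garden, and it is happy.")
filtered = remove_stop_words(tokens)
print(filtered)  # ['cat', 'garden', 'happy']
```

Using a set for the stop words keeps each membership check constant-time, which matters once the list and the text grow. Whether removing stop words helps depends on the task: it tends to suit searching and classification, but it can discard information that sentiment analysis or relationship extraction needs.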
As with most data science problems, it is important to preprocess and clean text. In Chapter 9, Text Analysis...