Natural language processing (NLP) is about analyzing text, articles and involves carrying out predictive analysis on textual data. The algorithm we make will address a simple problem, but the same concept is applicable to any text. We can also predict the genre of a book with NLP.
Consider the following Tab Separated Values (TSV), which is a tab-delimited dataset for us to apply NLP to and see how it works:
This is a small portion of the data we will be working on. In this case, the data represents customer reviews about a restaurant. The reviews are given as text, and they have a rating, which is 0 or 1 to indicate whether the customer liked the restaurant or not. 1 would mean the review is positive and 0 would indicate that it's not positive.
Usually, we would use a CSV file. Here, however, we are using a TSV file where the delimiter is a tab...