Let's use NLP to conduct a sentiment analysis of movie reviews. For this, we will use open source movie review data available at http://www.cs.cornell.edu/people/pabo/movie-review-data/.
First, we will import the libraries we need to load and work with the dataset that contains the movie reviews:
import numpy as np
import pandas as pd
Now, let us load the movie review data and print the first few rows to observe its structure:
df = pd.read_csv("moviereviews.tsv", sep="\t")
df.head()
Note that the dataset has 2000 movie reviews. Out of these, half are negative and half are positive.
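The size and class balance can be checked directly rather than taken on trust. The following is a minimal sketch on a small stand-in DataFrame. The column names label and review are assumptions made for illustration, since the actual columns of moviereviews.tsv are not shown here.

```python
import pandas as pd

# Toy stand-in for the real data; the column names "label" and "review"
# are assumed for illustration and may differ in moviereviews.tsv.
toy_df = pd.DataFrame({
    "label": ["neg", "pos", "pos", "neg"],
    "review": ["dull and slow", "a joy to watch", "sharp and funny", "a total mess"],
})

# Number of rows (reviews) in the dataset
n_reviews = len(toy_df)

# Count of reviews per class; a balanced dataset shows equal counts
label_counts = toy_df["label"].value_counts()

print(n_reviews)
print(label_counts)
```

Running the same two checks on df, with the real column name substituted, should confirm the figures quoted above.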
Now, let's start preparing the dataset for training the model. First, let us drop any rows that contain missing values:
df.dropna(inplace=True)
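To see what this step does, it helps to count the missing values before and after dropping them. This is a minimal sketch on a toy DataFrame; the column names are assumed for illustration and are not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Toy data with one missing review and one whitespace-only review.
# Column names are hypothetical.
toy_df = pd.DataFrame({
    "label": ["neg", "pos", "pos", "neg"],
    "review": ["dull and slow", np.nan, "sharp and funny", "   "],
})

# Missing values per column before cleaning
missing_before = toy_df.isnull().sum()

# Drop every row that has at least one missing value
toy_df.dropna(inplace=True)

# Missing values per column after cleaning
missing_after = toy_df.isnull().sum()

# The whitespace-only row is still present: a string of spaces is not null
rows_left = len(toy_df)

print(missing_before["review"], missing_after["review"], rows_left)
```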
Now we need to remove reviews that consist only of whitespace as well. Such strings are not null, so dropna() does not catch them, but they still need to be removed. For...