6. LSTMs, GRUs, and Advanced RNNs
Activity 6.01: Sentiment Analysis of Amazon Product Reviews
Solution
- Read in the data files for the train and test sets. Examine the shapes of the datasets and print out the top 5 records from the train data:

      import pandas as pd
      import numpy as np
      import matplotlib.pyplot as plt
      %matplotlib inline

      train_df = pd.read_csv("Amazon_reviews_train.csv")
      test_df = pd.read_csv("Amazon_reviews_test.csv")

      # Shapes of the train and test sets, as (rows, columns)
      print(train_df.shape, test_df.shape)
      train_df.head(5)
The dataset's shape and header are as follows:
- For convenience, when it comes to processing, separate the raw text and the labels for the train and test sets. You should have 4 variables, as follows: train_raw comprising raw text for the train data, train_labels with labels for the train data, test_raw containing raw text for the test data, and test_labels comprising labels for the test data. Print the first two reviews...
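The separation described in this step can be sketched as follows. This is a minimal illustration, not the activity's exact solution: the column names `review_text` and `label` are assumptions (check the real ones with `train_df.columns`), and the two small DataFrames are made-up stand-ins so the snippet runs without the CSV files.

```python
import pandas as pd

# Stand-in data so the sketch is self-contained; in the activity these
# frames come from the CSV files read in the previous step.
# ASSUMPTION: the columns are named `review_text` and `label`.
train_df = pd.DataFrame({
    "review_text": [
        "Great product, works well",
        "Stopped working after a week",
        "Okay for the price",
    ],
    "label": [1, 0, 1],
})
test_df = pd.DataFrame({
    "review_text": ["Would buy again", "Not as described"],
    "label": [1, 0],
})

# Pull the raw text and the labels out as NumPy arrays
train_raw = train_df["review_text"].values
train_labels = train_df["label"].values
test_raw = test_df["review_text"].values
test_labels = test_df["label"].values

# Print the first two reviews from the train text
for review in train_raw[:2]:
    print(review)
```

Keeping text and labels in separate arrays means the text can later be tokenized and padded without touching the labels, as long as the row order is left unchanged.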