Analyzing movie review sentiment with RNNs
Our first RNN project is movie review sentiment analysis. We’ll use the IMDb (https://www.imdb.com/) movie review dataset (https://ai.stanford.edu/~amaas/data/sentiment/) as an example. It contains 25,000 highly polar movie reviews for training and another 25,000 for testing. Each review is labeled as 1 (positive) or 0 (negative). We’ll build our RNN-based movie sentiment classifier in the following three sections: Analyzing and preprocessing the movie review data, Developing a simple LSTM network, and Boosting the performance with multiple LSTM layers.
Analyzing and preprocessing the data
We’ll start with data analysis and preprocessing, as follows:
- PyTorch’s torchtext has a built-in IMDb dataset, so first, we load the dataset:
>>> from torchtext.datasets import IMDB
>>> train_dataset = list(IMDB(split='train'))
>>> test_dataset = list(IMDB...
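Once a dataset is held in a list, a quick sanity check is to count how many reviews carry each label. The following is a minimal sketch, not the loading code itself: it assumes each element is a (label, text) pair, and it uses a tiny hypothetical stand-in list (toy_dataset) and a hypothetical helper (label_counts) so that it runs without downloading anything. The label encoding in a real run may differ from the 1/0 convention used here, so it is worth printing the counts before relying on it.

```python
from collections import Counter

# Hypothetical stand-in for a loaded dataset: a list of (label, text) pairs.
# Labels follow the 1 = positive, 0 = negative convention used in the text.
toy_dataset = [
    (1, "A moving, beautifully shot film."),
    (0, "Dull plot and wooden acting."),
    (1, "Great cast and a clever script."),
]


def label_counts(dataset):
    """Count how many examples carry each label."""
    return Counter(label for label, _ in dataset)


counts = label_counts(toy_dataset)
print(counts)
```

Calling the same helper on the training list would show whether the two classes are balanced, which matters when accuracy is later used as the evaluation metric.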