Sequence Models for Text Classification
In Chapter 5, Deep Learning for Sequences, we learned that RNNs perform well on sequence-modeling tasks, including text-related ones. In this chapter, we will apply plain RNNs and their variants to a sentiment classification task: processing an input sequence and predicting whether its sentiment is positive or negative.
We'll use the IMDb reviews dataset for this task. The dataset contains 50,000 highly polar movie reviews, each labeled with its sentiment, split into 25,000 for training and 25,000 for testing. A few reasons for using this dataset are as follows:
- It is conveniently available in Keras (in tokenized form) and can be loaded with a single command.
- The dataset is commonly used for testing new approaches/models. This should help you compare your results with other approaches easily.
- Longer sequences in the data (IMDb reviews can get very long) help us assess the differences between...
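As a minimal sketch of the single-command loading mentioned above, the following assumes TensorFlow's bundled Keras is installed and that the data can be downloaded on first use. The vocabulary cap of 10,000 words and the helper names `load_imdb` and `length_summary` are illustrative assumptions, not something prescribed by the dataset.

```python
from statistics import mean


def load_imdb(vocab_size=10000):
    """Load the pre-tokenized IMDb reviews through Keras.

    vocab_size is an assumed cap on the vocabulary; only the most frequent
    words keep their own index.
    """
    # Imported here so length_summary can be used without TensorFlow installed.
    from tensorflow.keras.datasets import imdb

    # Returns ((x_train, y_train), (x_test, y_test)); each review is a list
    # of integer word indices and each label is 0 or 1.
    return imdb.load_data(num_words=vocab_size)


def length_summary(sequences):
    """Return (shortest, mean, longest) length over a collection of sequences."""
    lengths = [len(seq) for seq in sequences]
    return min(lengths), mean(lengths), max(lengths)


if __name__ == "__main__":
    (x_train, y_train), (x_test, y_test) = load_imdb()
    print("train reviews:", len(x_train), "test reviews:", len(x_test))
    print("train review lengths (min, mean, max):", length_summary(x_train))
```

Printing the length summary is one quick way to see how long the reviews can get before deciding how to pad or truncate them for a model.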