Regularizing with the maximum sequence length
In this recipe, we will regularize a GRU-based neural network by tuning the maximum sequence length, using the IMDb dataset.
Getting ready
Up to now, we have not experimented much with the maximum sequence length, but it is sometimes one of the most important hyperparameters to tune.
Indeed, depending on the input dataset, the optimal maximum length can be quite different:
- A tweet is short, so a maximum length of several hundred tokens rarely makes sense
- A product or movie review can be significantly longer, and sometimes, the reviewer writes a lot of pros and cons about the product/movie, before getting to the final conclusion – in such cases, a larger maximum length may help
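To make the effect of the maximum length concrete, here is a minimal sketch in plain Python. It is an illustration only, not this recipe's preprocessing code: the helper name pad_or_truncate, the padding ID of 0, and the token IDs are all hypothetical. It shows that a short input is mostly padding once the maximum length exceeds it, while a long input loses everything past the cutoff.

```python
def pad_or_truncate(token_ids, max_length, pad_id=0):
    """Return a list of exactly max_length token IDs.

    Hypothetical helper: longer sequences are cut at max_length,
    shorter ones are right-padded with pad_id (assumed to be 0 here).
    """
    truncated = token_ids[:max_length]
    return truncated + [pad_id] * (max_length - len(truncated))


# Made-up token IDs standing in for a short tweet and a longer review
tweet = [12, 7, 93, 4]
review = list(range(1, 11))  # 10 tokens

max_length = 6
padded_tweet = pad_or_truncate(tweet, max_length)
cut_review = pad_or_truncate(review, max_length)

# Number of review tokens dropped by the cutoff
lost_tokens = max(0, len(review) - max_length)

print(padded_tweet)
print(cut_review)
print(lost_tokens)
```

With too small a maximum length, the end of a long review, where the conclusion often sits, is dropped; with too large a value, short texts are dominated by padding tokens.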
In this recipe, we will train a GRU on the IMDb dataset, containing movie reviews and associated labels (either positive or negative); this dataset contains some very lengthy texts. So, we will significantly...