Hyperparameters and tuning
Figure 10.4 shows that training for more epochs is not going to improve performance on this task. The best validation accuracy is about 80%, reached after 10 epochs. However, 80% accuracy is not very good. How can we improve it? Here are some ideas. None of them is guaranteed to work, but each is worth experimenting with:
- If more training data is available, the amount of training data can be increased.
- Preprocessing techniques that can remove noise from the training data can be investigated—for example, stopword removal, removing non-words such as numbers and HTML tags, stemming and lemmatization, and lowercasing. Details on these techniques were covered in Chapter 5.
- Changing the learning rate. For example, a lower learning rate takes smaller steps, which can make training more stable and less likely to overshoot a good minimum, at the cost of slower convergence.
- Decreasing the batch size.
- Changing the number of layers and the number of neurons in each layer...
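
To make a few of these ideas concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the code used earlier in this chapter: the names `clean_text`, `STOPWORDS`, `SEARCH_SPACE`, and `hyperparameter_grid` are hypothetical, the stopword list is deliberately tiny, and the hyperparameter values are placeholders rather than recommendations.

```python
import itertools
import re

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch);
# a real list would be much longer.
STOPWORDS = {"a", "an", "and", "is", "it", "of", "the", "this", "to", "was"}


def clean_text(text):
    """Lowercase the text, strip HTML tags and numbers, and drop stopwords."""
    text = text.lower()
    # Replace HTML tags such as <br /> with a space.
    text = re.sub(r"<[^>]+>", " ", text)
    # Keep alphabetic tokens only (optionally with one inner apostrophe),
    # which discards numbers and punctuation.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text)
    return " ".join(token for token in tokens if token not in STOPWORDS)


# Hypothetical search space: the values are illustrative placeholders.
SEARCH_SPACE = {
    "learning_rate": [1e-3, 1e-4],
    "batch_size": [32, 16],
    # Each tuple lists the number of neurons in each hidden layer.
    "hidden_layers": [(16,), (16, 16), (32, 16)],
}


def hyperparameter_grid(space):
    """Yield one dict for every combination of the values in `space`."""
    names = list(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))
```

Each dictionary produced by `hyperparameter_grid(SEARCH_SPACE)` describes one experiment: the model would be rebuilt with the given layer sizes, trained with the given learning rate and batch size, and compared with the other configurations on validation accuracy. With the placeholder values above, the grid contains 12 configurations, so even a small search space multiplies the training time considerably.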