The concept of overfitting
So far, we’ve seen that the accuracy on the training dataset is typically more than 95%, while the accuracy on the validation dataset is ~89%. Essentially, this indicates that the model does not generalize as well to unseen data as it does to the data it was trained on. It also suggests that the model has learned patterns, including edge cases, that are specific to the training dataset and do not carry over to the validation dataset.
This scenario, where accuracy on the training dataset is high while accuracy on the validation dataset is considerably lower, is known as overfitting.
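To make this concrete, overfitting can be spotted during training by tracking both accuracies and watching the gap between them. The following is a minimal sketch of such a check; the helper name and the model/DataLoader names in the usage comments are hypothetical, not from the preceding code:

```python
import torch

@torch.no_grad()
def dataset_accuracy(model, dataloader, device="cpu"):
    """Fraction of correctly classified samples across a DataLoader."""
    model.eval()  # disable dropout/batch-norm updates while evaluating
    correct = total = 0
    for x, y in dataloader:
        x, y = x.to(device), y.to(device)
        preds = model(x).argmax(dim=1)
        correct += (preds == y).sum().item()
        total += y.numel()
    return correct / total

# Hypothetical usage with an existing model and DataLoaders:
#   train_acc = dataset_accuracy(model, train_dl)
#   val_acc   = dataset_accuracy(model, val_dl)
# A large, persistent gap (e.g., ~95% train vs. ~89% validation)
# is the signature of overfitting.
```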
Dropout and regularization are two of the typical strategies employed to reduce overfitting. We will look at the impact each has on the training and validation losses in the following sections.
Impact of adding dropout
We have already learned that gradients are calculated whenever loss.backward() is called, and that the actual weight update then happens when optimizer.step() is executed. Typically, we would have...
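As a sketch of what adding dropout looks like in practice, an nn.Dropout layer can be placed between the hidden layers, and it is only active in training mode. The architecture, batch shape, and hyperparameters below are illustrative assumptions for this sketch, not the exact model used here:

```python
import torch
import torch.nn as nn

# Illustrative network with dropout between the hidden layers;
# the layer sizes are assumptions for the sketch, not the exact model.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 1000),
    nn.ReLU(),
    nn.Dropout(0.25),  # randomly zeroes 25% of activations during training
    nn.Linear(1000, 10),
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

# One training step on a dummy batch (random data for the sketch)
x, y = torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))
model.train()          # dropout is active in training mode
optimizer.zero_grad()  # clear gradients from the previous step
loss = loss_fn(model(x), y)
loss.backward()        # compute gradients for every weight
optimizer.step()       # the weight update happens here

model.eval()           # in eval mode, nn.Dropout acts as an identity
```

Because a different random subset of units is dropped at every training step, the network cannot rely on any single activation to memorize the training data, which is what narrows the gap between training and validation performance.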