Reviewing methods to prevent overfitting in CNNs
Overfitting occurs when a model fits the training set too well but fails to generalize to unseen cases. For example, a CNN may memorize the specific traffic sign images in the training set instead of learning general patterns. This can be very dangerous if a self-driving car cannot recognize signs under ever-changing conditions, such as weather, lighting, and viewing angles that differ from those in the training set. To recap, here's what we can do to reduce overfitting:
- Collecting more training data (if possible and feasible) to cover a wider variety of inputs.
- Using data augmentation, wherein we generate new training samples by transforming existing ones when time or cost does not allow us to collect more data.
- Employing dropout, which diminishes complex co-adaptations among neighboring neurons.
- Adding a Lasso (L1) and/or Ridge (L2) penalty, which penalizes large model weights so that they cannot fit the training data so closely that overfitting arises.
- Reducing the complexity...
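To make three of these techniques concrete, here is a minimal NumPy sketch of data augmentation, dropout, and a weight penalty. It is an illustration under stated assumptions, not the implementation used by any deep learning library: the function names `random_brightness`, `dropout`, and `weight_penalty` are hypothetical, images are assumed to be float arrays scaled to [0, 1], and a brightness shift is only one of many possible augmentations.

```python
import numpy as np


def random_brightness(image, max_delta, rng):
    """Data augmentation: shift brightness by a random amount.

    Assumes `image` holds floats in [0, 1]; the result is clipped
    back into that range.
    """
    delta = rng.uniform(-max_delta, max_delta)
    return np.clip(image + delta, 0.0, 1.0)


def dropout(activations, rate, rng):
    """Inverted dropout, for use during training only.

    Each unit is zeroed with probability `rate`, and the surviving
    units are rescaled so the expected activation stays the same.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob


def weight_penalty(weights, l1=0.0, l2=0.0):
    """Lasso (L1) plus Ridge (L2) penalty to add to the training loss."""
    return l1 * np.sum(np.abs(weights)) + l2 * np.sum(weights ** 2)
```

In a training loop, the augmentation would be applied to each input batch, dropout to the hidden activations, and the penalty added to the data loss before computing gradients. At inference time, dropout is switched off; because the sketch uses the inverted variant, no extra rescaling is needed then.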