Regularizing a CNN with vanilla NN methods
Since CNNs are a special kind of NN, most vanilla NN regularization methods can be applied to them. A non-exhaustive list of regularization techniques we can use with CNNs is the following:
- Kernel size
- Pooling size
- L2 regularization
- Number of units in the fully connected layers (if any)
- Dropout
- Batch normalization
In this recipe, we will apply batch normalization to add regularization, reusing the LeNet-5 model on the CIFAR-10 dataset, but any other method may work as well.
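A minimal sketch of what this looks like in Keras, assuming the classic LeNet-5 hyperparameters (6 and 16 filters with 5x5 kernels, 120- and 84-unit dense layers); the recipe's exact model may differ, and the `build_lenet5_bn` helper name is illustrative:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_lenet5_bn(input_shape=(32, 32, 3), num_classes=10):
    """LeNet-5-style CNN with a BatchNormalization layer after each
    convolution. Architecture details are assumptions, not the
    recipe's exact model."""
    model = models.Sequential([
        layers.Conv2D(6, 5, activation='relu', input_shape=input_shape),
        layers.BatchNormalization(),
        layers.MaxPooling2D(2),
        layers.Conv2D(16, 5, activation='relu'),
        layers.BatchNormalization(),
        layers.MaxPooling2D(2),
        layers.Flatten(),
        layers.Dense(120, activation='relu'),
        layers.Dense(84, activation='relu'),
        layers.Dense(num_classes, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model
```

The CIFAR-10 images (32x32 RGB, 10 classes) can be loaded with `tf.keras.datasets.cifar10.load_data()`. Note that whether batch normalization should go before or after the activation is debated; placing it after the activation, as above, is one common choice.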
Batch normalization is a simple yet very effective method that helps regularize NNs and makes them converge faster. The idea of batch normalization is to normalize the activation values of a hidden layer over a given batch. The method is similar to applying a standard scaler during data preparation of quantitative data, but there are a few differences. Let's have a look at how it works.
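The normalization step itself can be sketched in a few lines of NumPy; `gamma` and `beta` stand in for batch normalization's learnable scale and shift parameters (this sketch omits the running statistics a real layer also tracks for inference):

```python
import numpy as np

def batch_normalize(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch axis, then apply a
    learnable scale (gamma) and shift (beta).

    Unlike a standard scaler, which is fit once on the whole training
    set, these statistics are recomputed for every mini-batch."""
    mu = x.mean(axis=0)                     # per-feature batch mean
    var = x.var(axis=0)                     # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # zero mean, unit variance
    return gamma * x_hat + beta

batch = np.random.randn(64, 8) * 5.0 + 3.0  # 64 activations, 8 features
normalized = batch_normalize(batch)
# Each feature column now has (approximately) zero mean and unit variance.
```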
The first step is to compute the mean value µ and the standard...