Optimizing with batch normalization
Another well-known optimization for CNNs is batch normalization. This technique normalizes the inputs of the current batch before feeding them to the next layer; as a result, the mean activation for each batch is around zero and the standard deviation around one, which helps avoid internal covariate shift. By doing this, the input distribution of the data per batch has less effect on the network, so the model generalizes better and trains faster.
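The transform itself is simple. The following toy NumPy snippet (an illustration only, not part of the recipe) shows how a batch of activations is normalized to roughly zero mean and unit standard deviation before a learnable scale and shift are applied:

import numpy as np

# Toy illustration of the batch normalization transform (not the recipe's code):
# normalize a batch of activations, then apply the learnable scale (gamma)
# and shift (beta) parameters.
x = np.random.rand(32, 10)           # a batch of 32 activation vectors
x_hat = (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-5)  # epsilon avoids division by zero
gamma, beta = 1.0, 0.0               # learned during training; identity values here
y = gamma * x_hat + beta

print(x_hat.mean(axis=0).round(3))   # approximately 0 per feature
print(x_hat.std(axis=0).round(3))    # approximately 1 per feature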
In the following recipe, we'll show you how to apply batch normalization to an image dataset with 10 classes (CIFAR-10). First, we train the network architecture without batch normalization to demonstrate the difference in performance.
How to do it...
- Import all necessary libraries:
import numpy as np
from matplotlib import pyplot as plt
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.callbacks import EarlyStopping  # assumed callback; the original import line was truncated here
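To give an idea of where the normalization layers go, here is a minimal sketch, not the recipe's full architecture, that places a BatchNormalization layer after each convolution and before the non-linearity; the filter counts and layer sizes are illustrative assumptions (depending on your Keras version, BatchNormalization may also be imported directly from keras.layers):

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Activation
from keras.layers.normalization import BatchNormalization

# Sketch only: BatchNormalization after each convolution, before the activation.
model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same', input_shape=(32, 32, 3)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(10, activation='softmax'))  # CIFAR-10 has 10 classes
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])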