Implementing a simple CNN
In this recipe, we will develop a CNN based on the LeNet-5 architecture, which was first introduced in 1998 by Yann LeCun et al. for handwritten and machine-printed character recognition.
Figure 8.3: LeNet-5 architecture – Original image published in [LeCun et al., 1998]
This architecture consists of two convolutional blocks, each composed of convolution, ReLU, and max pooling operations, used for feature extraction, followed by a flattening layer and two fully connected layers that classify the images.
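To make the feature-extraction stage concrete, the following minimal sketch traces how the spatial size of an image changes through two convolution-ReLU-max pooling blocks. The specific numbers are assumptions for illustration, not values taken from this recipe: a 28x28 MNIST input, 5x5 kernels without padding, 6 and 16 filters (as in the original LeNet-5), and 2x2 non-overlapping pooling. The helper names are hypothetical.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling window slides over the input."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_extractor_shapes(input_size=28, filters=(6, 16), kernel=5, pool=2):
    """Trace (height, width, channels) through two conv-ReLU-max-pool blocks.

    All default values are illustrative assumptions, not the recipe's settings.
    """
    shapes = []
    size = input_size
    for n_filters in filters:
        # Convolution without padding shrinks each side by (kernel - 1).
        size = conv_output_size(size, kernel)
        # ReLU is applied element-wise, so it leaves the shape unchanged.
        shapes.append(("conv+relu", (size, size, n_filters)))
        # Non-overlapping max pooling divides each side by the pool size.
        size = conv_output_size(size, pool, stride=pool)
        shapes.append(("maxpool", (size, size, n_filters)))
    # The flattening layer turns the last feature map into a single vector.
    shapes.append(("flatten", (size * size * filters[-1],)))
    return shapes


shapes = feature_extractor_shapes()
for name, shape in shapes:
    print(name, shape)
```

Under these assumptions, the flattened vector that feeds the fully connected layers has 256 elements; with the 32x32 input of the original LeNet-5 it would have 400.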
Our goal will be to improve upon our accuracy in predicting MNIST digits.
Getting ready
To access the MNIST data, Keras provides a package (tf.keras.datasets) that has excellent dataset-loading functionalities. (Note that TensorFlow also provides its own collection of ready-to-use datasets with the TF Datasets API.) After loading the data, we will set up our model variables, create the model, train the model in batches, and then visualize...
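As a rough sketch of the loading and batching steps, the helpers below wrap tf.keras.datasets.mnist.load_data() and prepare the arrays for a convolutional model. The function names are hypothetical, and scaling the pixels to [0, 1] and adding a channel axis are assumed preprocessing choices rather than steps stated in this recipe.

```python
import numpy as np


def load_mnist():
    """Download (on first call) and return the MNIST train/test splits via Keras."""
    # Imported lazily so the helpers below can be used without TensorFlow installed.
    import tensorflow as tf
    return tf.keras.datasets.mnist.load_data()


def preprocess(images, labels):
    """Scale pixels to [0, 1] and add a trailing channel axis (assumed preprocessing)."""
    x = images.astype(np.float32) / 255.0
    x = x[..., np.newaxis]  # (N, 28, 28) -> (N, 28, 28, 1)
    y = labels.astype(np.int64)
    return x, y


def iterate_batches(x, y, batch_size):
    """Yield consecutive (x, y) mini-batches; the last one may be smaller."""
    for start in range(0, len(x), batch_size):
        yield x[start:start + batch_size], y[start:start + batch_size]
```

A typical call would unpack (x_train, y_train), (x_test, y_test) = load_mnist(), pass each split through preprocess, and then loop over iterate_batches(x_train, y_train, batch_size) during training. The batch size is left as a parameter here because it is a tuning choice.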