Implementing a LeNet-5 step by step
In this section, we will learn how to build a LeNet-5 architecture to classify images in the MNIST dataset. The next figure shows how the data flows through the first two convolutional layers. The input image is processed in the first convolutional layer using the filter weights, which results in 32 new images, one for each filter in the layer. These images are then down-sampled with a pooling operation, so the image resolution decreases from 28×28 to 14×14. The 32 smaller images are then processed in the second convolutional layer. Here every output channel needs its own filter weights for each of the 32 input images, so the layer holds one filter per combination of input and output channel. The images are again down-sampled with a pooling operation, which decreases the resolution from 14×14 to 7×7. The second convolutional layer has 64 output channels, so it produces 64 images (feature maps) of 7×7 pixels each.
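The shape bookkeeping described above can be checked with a few lines of plain Python. This is only a sketch under stated assumptions: the text does not give the filter size, so a 5×5 kernel is assumed here, along with 'same' padding (the convolution keeps the resolution) and 2×2 pooling with stride 2 (which halves it). The helper name conv_same_pool is hypothetical and used for illustration only.

```python
def conv_same_pool(height, width, in_channels, out_channels, kernel=5, pool=2):
    """Output shape and parameter count of one conv + pooling stage.

    Assumptions (not stated in the text): 5x5 kernels, 'same' padding,
    and non-overlapping pool x pool pooling.
    """
    # 'same' padding keeps height and width unchanged after the convolution;
    # pooling then divides both by the pool size.
    out_h, out_w = height // pool, width // pool
    # One kernel x kernel filter per (input channel, output channel) pair,
    # plus one bias per output channel.
    weights = kernel * kernel * in_channels * out_channels
    biases = out_channels
    return (out_h, out_w, out_channels), weights + biases


# First convolutional layer: one grayscale 28x28 MNIST image in, 32 images out.
shape1, params1 = conv_same_pool(28, 28, in_channels=1, out_channels=32)

# Second convolutional layer: 32 images of 14x14 in, 64 images out.
shape2, params2 = conv_same_pool(shape1[0], shape1[1],
                                 in_channels=shape1[2], out_channels=64)

print("after layer 1:", shape1, "parameters:", params1)
print("after layer 2:", shape2, "parameters:", params2)
```

Under these assumptions the first stage yields 32 images of 14×14 and the second yields 64 images of 7×7. Most of the parameters sit in the second layer, because it needs a separate filter for each of the 32 × 64 input/output channel pairs.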
The 64 resulting images are...