Defining the model of the neural network
Now we want to write our neural network, which we can call our model, and train it. We know that it should use convolutions, but we don't know much more than that. Let's take inspiration from an old but very influential CNN: LeNet.
LeNet
LeNet was one of the first CNNs. Dating back to 1998, it's pretty small and simple for today's standards. But it is enough for this task.
This is its architecture:
LeNet accepts 32x32 images and has the following layers:
- The first layer is composed of six 5x5 convolutions, emitting images of 28x28 pixels.
- The second layer subsamples the image (for example, computing the average of four pixels), emitting images of 14x14 pixels.
- The third layer is composed of 16 5x5 convolutions, emitting images of 10x10 pixels.
- The fourth layer subsamples the image (for example, computing the average of four pixels), emitting...