CNNs
CNNs are the cornerstone of image classification in deep learning. This section gives an introduction to them, explains the history of CNNs, and will explain why they are so powerful.
Before we begin, we will look at a simple deep learning architecture. Deep learning models are difficult to train, so using an existing architecture is often the best place to start. An architecture is an existing deep learning model that was state-of-the-art when initially released. Some examples are AlexNet, VGGNet, GoogleNet, and so on. The architecture we will look at is the original LeNet architecture for digit classification from Yann LeCun and others from the mid 1990s. This architecture was used for the MNIST
dataset. This dataset is comprised of grayscale images of 28 x 28 size that contain the digits 0 to 9. The following diagram shows the LeNet architecture:
Figure 5.1: The LeNet architectureÂ
The original images are 28 x 28 in size. We have a series of hidden layers which are convolution and pooling...