Understanding the Architecture of a CNN
Let's assume we have the task of classifying each of the MNIST images as a number between 0 and 9. The input in the previous example is an image matrix. For a colored image, each pixel is an array with three values corresponding to the RGB color scheme. For grayscale images, each pixel is just one number, as we saw earlier.
To understand the architecture of a CNN, it is best to separate it into two sections as visualized in the image that follows.
A forward pass of the CNN involves a set of operations in the two sections.
Figure 4.4: Application of convolution and ReLU operations
The figure is explained in the following sections:
- Feature extraction
- Neural network
Feature Extraction
The first section of a CNN is all about feature extraction. Conceptually, it can be interpreted as the model's attempt to learn which features distinguish one class from another. In the task of classifying images, these features might...