The convolutional layer is the most important building block of a CNN. It consists of a set of filters (also known as kernels or feature detectors), where each filter is applied across all areas of the input data. A filter is defined by a set of learnable weights. As a nod to the topic at hand, the following image illustrates this very well:
Displayed is a two-dimensional input layer of a neural network. For the sake of simplicity, we'll assume that this is the input layer, but it can be any layer of the network. As we've seen in the previous chapters, each input neuron represents the color intensity of a pixel (we'll assume it's a grayscale image for simplicity). First, we'll apply a 3x3 filter in the top-right corner of the image. Each input neuron is associated with a single weight of the filter. It has nine weights, because of...