When we connect convolutional layers, a hyperparameter known as the receptive field or filter size prevents us from having to connect the unit to the entire input, but rather focuses on learning a particular feature. Our convolutional layers typically learn features from simple to complex. The first layer typically learns low-level features, the next layer learns mid-level features, and the last convolutional layer learns high-level features. One of the beautiful features of this is that we do not explicitly tell the network to learn different features at these various levels; it learns to differentiate its task in this manner through the training process:
As we pass through this process, our network will develop a two-dimensional activation map to track the response of that particular filter at a given position. The network will learn to keep filters that...