A summary of convolution operations
This section summarizes the different convolution operations. A convolutional layer takes I input channels and produces O output channels. It uses I × O × K weights (not counting bias terms), where K is the number of values in the kernel.
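To make the I × O × K count concrete, here is a minimal sketch in Python. The function name `conv_param_count` and the example layer sizes are illustrative assumptions, not something from the text, and bias terms are left out because the formula above does not include them.

```python
from math import prod

def conv_param_count(in_channels, out_channels, kernel_shape):
    """Number of weights in a convolutional layer: I * O * K.

    K is the number of values in one kernel, i.e. the product of the
    kernel's spatial dimensions. Bias terms are not counted (assumption
    matching the formula in the text).
    """
    k = prod(kernel_shape)
    return in_channels * out_channels * k

# Hypothetical layer: 3 input channels, 64 output channels, 3 x 3 kernel.
n_weights = conv_param_count(3, 64, (3, 3))
print(n_weights)  # 3 * 64 * 9
```

Under these assumptions, a 3 × 3 kernel has K = 9, so the layer has 3 × 64 × 9 weights. Doubling either the input or the output channel count doubles the total.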
Basic convolutional neural networks (CNN or ConvNet)
Let's briefly remind ourselves what a CNN is. A CNN takes an input image (two dimensions), a text (two dimensions), or a video (three dimensions) and applies multiple filters to it. Each filter is like a flashlight sliding across the input, and the area it shines on at any moment is called the receptive field. Each filter is a tensor with the same depth as the input (for instance, if the image has a depth of 3, the filter must also have a depth of 3).
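The flashlight picture can be sketched with NumPy. This is only an illustration under stated assumptions: a square filter, a stride of 1, no padding, and a height × width × depth array layout. The helper names `receptive_field` and `filter_positions` are hypothetical.

```python
import numpy as np

def receptive_field(image, top, left, kernel_size):
    """Return the patch of `image` that a square filter covers when its
    top-left corner sits at (top, left). All depth channels are included."""
    return image[top:top + kernel_size, left:left + kernel_size, :]

def filter_positions(height, width, kernel_size):
    """Top-left corners the filter visits (assumes stride 1, no padding)."""
    return [(r, c)
            for r in range(height - kernel_size + 1)
            for c in range(width - kernel_size + 1)]

# Toy 5 x 5 input with depth 3, and a 3 x 3 filter of the same depth.
image = np.arange(5 * 5 * 3, dtype=float).reshape(5, 5, 3)
kernel = np.ones((3, 3, 3))

patch = receptive_field(image, 0, 0, kernel.shape[0])
positions = filter_positions(image.shape[0], image.shape[1], kernel.shape[0])

# The receptive field has exactly the filter's shape, depth included.
print(patch.shape == kernel.shape)
print(len(positions))
```

Because the filter depth matches the input depth, the filter only moves along the height and width of the input. It never slides along the depth axis.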
As the filter slides, or convolves, across the input image, the values in the filter are multiplied by the values of the input inside the receptive field. The multiplications are then summarized into...