CNNs are widely used in computer vision, where they outperform most traditional computer vision techniques. A CNN combines the convolution operation with a neural network, hence the name convolutional neural network. So, before diving into the neural network aspect of CNNs, we will introduce the convolution operation and see how it works.
The main purpose of the convolution operation is to extract information, or features, from an image. Any image can be treated as a matrix of values, and a specific group of values in this matrix forms a feature. The convolution operation scans this matrix and tries to extract the features that are relevant to, or explanatory of, that image. For example, consider a 5 by 5 image whose pixel intensity values are shown as zeros and ones: