Background of a CNN
CNN, a particular form of deep learning models, is not a new concept, and they have been widely adopted by the vision community for a long time. The model worked well in recognizing the hand-written digit by LeCun et al in 1998 [90]. But unfortunately, due to the inability of CNNs to work with higher resolution images, its popularity has diminished with the course of time. The reason was mostly due to hardware and memory constraints, and also the lack of availability of large-scale training datasets. As the computational power increases with time, mostly due to the wide availability of CPUs and GPUs and with the generation of big data, various large-scale datasets, such as the MIT Places dataset (see Zhou et al., 2014), ImageNet [91] and so on. it became possible to train larger and complex models. This is initially shown by Krizhevsky et al [4] in their paper, Imagenet classification using deep convolutional neural networks. In that paper, they brought down...