Training a CNN
In this recipe, after reviewing the fundamental components of CNNs, we will train one on a classification task using the CIFAR-10 dataset.
Getting started
Computer vision is a special field for many reasons. The data handled in computer vision projects is usually large, multidimensional, and unstructured. However, the most distinctive aspect of this data is arguably its spatial structure.
This spatial structure brings many potential difficulties, such as the following:
- Aspect ratio: Images come with different aspect ratios depending on their source, such as 16:9, 4:3, 1:1, and 9:16
- Occlusion: An object can be occluded by another one
- Deformation: An object can be deformed, either because of perspective or physical deformation
- Point of view: Depending on the point of view, an object can look totally different
- Illumination: A picture can be taken under many lighting conditions, which may alter the appearance of the image
Many of these difficulties are...