We've already read about how computers format images in Chapter 4, Computer Vision for Self-Driving Cars. A color image has three channels, red, green, and blue, popularly known as RGB, and each channel has its own pixel values. So, if the size of an image is B x A x 3, it has B rows, A columns, and 3 channels. For example, an image of size 28 x 28 x 3 has 28 rows, 28 columns, and 3 channels.
This is how our computer sees images. For grayscale (black and white) images, there is only one channel.
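To make the rows x columns x channels layout concrete, here is a minimal sketch that builds a 28 x 28 x 3 image as an array. It assumes NumPy is installed; the variable names (`rgb_image`, `red_channel`, `gray_image`) are made up for illustration, and the pixel values are placeholders rather than a real photo.

```python
import numpy as np

# Hypothetical 28 x 28 RGB image, all black to start with.
# uint8 stores integer pixel values in the range 0-255.
rgb_image = np.zeros((28, 28, 3), dtype=np.uint8)

# Set the pixel at row 0, column 0 to pure red: (R, G, B) = (255, 0, 0).
rgb_image[0, 0] = [255, 0, 0]

# The shape reads as (rows, columns, channels).
rows, cols, channels = rgb_image.shape
print(rows, cols, channels)  # 28 28 3

# Each channel on its own is a 28 x 28 grid of pixel values.
red_channel = rgb_image[:, :, 0]
print(red_channel.shape)  # (28, 28)

# A grayscale image stores one intensity value per pixel,
# so a 2D array with no channel axis is enough.
gray_image = np.zeros((28, 28), dtype=np.uint8)
print(gray_image.shape)  # (28, 28)
```

Indexing with `rgb_image[row, column, channel]` returns a single pixel value, which is the number the computer actually works with.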
In the following screenshot, you can see a visual example of a computer viewing an image:
Fig 6.1: Computer viewing an image
In the next section, we will read about why we need CNNs.