As explained in Chapter 3, Modern Neural Networks, input images have to be preprocessed. The most common preprocessing method is to divide each channel by 127.5 (255/2, the midpoint of the 8-bit pixel range) and subtract 1. This maps pixel values from the range 0 to 255 to the range -1 to 1:
Figure 9-6: Example of preprocessing for a 3 x 3 image with a single channel
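The scaling described above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming an 8-bit input image; the function name `preprocess` and the pixel values are made up for the example and are not the ones shown in the figure:

```python
import numpy as np

def preprocess(image):
    """Map uint8 pixel values from [0, 255] to float32 values in [-1, 1]."""
    # Dividing by 127.5 gives values in [0, 2]; subtracting 1 centers them on 0.
    return image.astype(np.float32) / 127.5 - 1.0

# Hypothetical 3 x 3 single-channel image (arbitrary values).
image = np.array([[0, 51, 102],
                  [127, 153, 204],
                  [255, 25, 76]], dtype=np.uint8)

scaled = preprocess(image)
print(scaled)
```

A pixel of 0 becomes -1 and a pixel of 255 becomes 1, so the smallest and largest possible inputs land exactly on the bounds of the target range.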
However, there are many ways to represent images, depending on the following:
- The order of the channels: RGB or BGR
- The range of the pixel values: 0 to 1, -1 to 1, or 0 to 255
- The order of the dimensions: [W, H, C] or [C, W, H]
- The orientation of the image
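To make these differences concrete, here is a hedged sketch of a helper that converts an image into one particular target format. It assumes, purely for illustration, that the model was trained on RGB images with values between -1 and 1 and with the channel dimension last; the function name `to_training_format` and its parameters are hypothetical. Since NumPy arrays are usually stored with height first, the layouts are written as HWC and CHW in the code:

```python
import numpy as np

def to_training_format(image, channel_order="RGB", layout="HWC"):
    """Convert a uint8 image to RGB, channels-last, values in [-1, 1].

    The target format is an assumption made for this example; a real
    model may expect something different.
    """
    if layout == "CHW":
        # Move the channel dimension from first to last.
        image = np.transpose(image, (1, 2, 0))
    if channel_order == "BGR":
        # Reverse the channel axis to turn BGR into RGB.
        image = image[..., ::-1]
    return image.astype(np.float32) / 127.5 - 1.0

# Hypothetical input: a 2 x 4 pure-blue image stored as BGR, channels first.
bgr_chw = np.zeros((3, 2, 4), dtype=np.uint8)
bgr_chw[0] = 255  # In BGR order, channel 0 is blue.

rgb_hwc = to_training_format(bgr_chw, channel_order="BGR", layout="CHW")
print(rgb_hwc.shape)
```

Orientation is not handled here; if the device delivers rotated frames, a call such as `np.rot90` could be applied before the conversion.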
When porting a model, it is paramount to apply exactly the same preprocessing on the device as was used during training. Failing to do so will degrade the model's predictions, and can even make it fail completely, because the input data will differ too much from the training data.
All mobile deep learning frameworks provide some options to specify preprocessing...