Data augmentation can take several forms, and a few choices have to be made when setting it up. First, augmentation can be done either offline or online. Offline augmentation means transforming all the images before training starts and saving the resulting versions for later use. Online augmentation means applying the transformations as each new batch is generated inside the training input pipeline.
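To make the distinction concrete, here is a minimal sketch in plain Python. It is only an illustration under simplifying assumptions: images are tiny nested lists, the only transformation is a random horizontal flip, and the names `random_flip`, `augment_offline`, and `batches_online` are hypothetical helpers, not part of any library.

```python
import random


def random_flip(image, rng):
    """Return a copy of a 2D image (list of rows), flipped horizontally half the time."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return [row[:] for row in image]


def augment_offline(images, copies, rng):
    """Offline: generate and store every augmented version before training."""
    stored = []
    for image in images:
        stored.append([row[:] for row in image])  # keep the original
        for _ in range(copies):
            stored.append(random_flip(image, rng))
    return stored


def batches_online(images, batch_size, rng):
    """Online: transform each image on the fly while a batch is assembled."""
    order = list(range(len(images)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        indices = order[start:start + batch_size]
        yield [random_flip(images[i], rng) for i in indices]


rng = random.Random(0)
dataset = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]

# Offline: the augmented set is built once and then stays fixed.
offline_set = augment_offline(dataset, copies=2, rng=rng)

# Online: each epoch draws fresh random transformations.
for epoch in range(2):
    for batch in batches_online(dataset, batch_size=2, rng=rng):
        pass  # a training step would consume `batch` here
```

In the offline case, the stored set grows with the number of copies per image. In the online case, nothing extra is stored, but the transformation cost is paid again for every batch.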
Since augmentation operations can be computationally heavy, applying them beforehand and storing the results can reduce the latency of the input pipelines. However, this requires enough storage space to hold the augmented dataset, which often limits the number of different versions that can be generated. By randomly transforming the images on the fly, online solutions can provide different-looking versions at every epoch. Although computationally more expensive, this presents more variation to the networks. The choice between offline...