Data augmentation
Data augmentation is a simple yet powerful tool for mitigating overfitting, particularly when limited real data is available. Data augmentation techniques leverage domain knowledge to enrich the available training data. Because its purpose is to enrich what the model learns from, data augmentation is usually applied only to the training data, not to the validation or test data. For example, assume we are training a face recognition algorithm and have only 10 images per person. We can double the number of training samples simply by flipping the images horizontally. Furthermore, we can increase the diversity of the training data by applying transformations such as shifting, scaling, and rotating, with parameters drawn at random. Instead of using fixed values for these transformations, we can use a random number generator to draw new values at each training epoch. Thus, the ML model is exposed to new variations of the training data at each epoch. This simple data augmentation...
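The per-epoch randomization described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a definitive implementation: the function names `augment` and `augmented_epochs`, the `max_shift` parameter, and the choice to zero-fill pixels vacated by a shift are all assumptions made for this example. Only flipping and shifting are shown; scaling and rotation would follow the same pattern with an image-processing library.

```python
import numpy as np


def augment(image, rng, max_shift=4):
    """Return a randomly flipped and shifted copy of an (H, W) or (H, W, C) image.

    Hypothetical helper for illustration; max_shift is an assumed parameter.
    """
    out = image
    # Flip horizontally with probability 0.5.
    if rng.random() < 0.5:
        out = out[:, ::-1]

    # Draw a fresh integer shift (in pixels) for each axis.
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    h, w = out.shape[:2]

    # Copy the overlapping region into a zero-filled canvas (assumed fill policy).
    shifted = np.zeros_like(out)
    src_y, dst_y = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    src_x, dst_x = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    shifted[dst_y, dst_x] = out[src_y, src_x]
    return shifted


def augmented_epochs(train_images, num_epochs, seed=0):
    """Yield one freshly augmented copy of the training set per epoch."""
    rng = np.random.default_rng(seed)
    for _ in range(num_epochs):
        # New random draws every epoch, so the model sees new variations each time.
        yield np.stack([augment(img, rng) for img in train_images])
```

In a training loop, one would iterate over `augmented_epochs(train_images, num_epochs)` and fit on each yielded batch, while passing the validation and test images to the model unchanged.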