Summary
Transferring methods for handling data imbalance from classical machine learning models to deep learning models poses unique challenges, primarily because of the distinct types of data these models work with. Classical machine learning models typically deal with structured, tabular data, whereas deep learning models often grapple with unstructured data such as images, text, audio, and video. This chapter explored how to adapt sampling techniques to deep learning models. To facilitate this, we trained a model on an imbalanced version of the MNIST dataset, applying various oversampling methods to the training data.
Incorporating random oversampling into a deep learning pipeline involves randomly duplicating minority-class samples until every class has as many samples as the majority class. This is usually performed with APIs from libraries such as imbalanced-learn, which interoperate well with Keras, TensorFlow, and PyTorch.
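As a minimal sketch of the idea, the NumPy-only function below mirrors what imbalanced-learn's `RandomOverSampler` does: minority-class indices are drawn with replacement until every class matches the majority count. The synthetic 28x28 arrays and the `random_oversample` helper are illustrative stand-ins, not the chapter's actual code.

```python
import numpy as np

# Synthetic stand-in for an imbalanced MNIST subset:
# 100 "images" of class 0 and only 10 of class 1.
rng = np.random.default_rng(0)
X = rng.random((110, 28, 28))
y = np.array([0] * 100 + [1] * 10)

def random_oversample(X, y, rng):
    """Duplicate minority samples at random until all classes are balanced."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    idx = []
    for cls, count in zip(classes, counts):
        cls_idx = np.flatnonzero(y == cls)
        idx.append(cls_idx)
        if count < target:
            # Draw the missing samples with replacement from this class.
            idx.append(rng.choice(cls_idx, size=target - count, replace=True))
    idx = np.concatenate(idx)
    rng.shuffle(idx)  # avoid long runs of a single class
    return X[idx], y[idx]

X_res, y_res = random_oversample(X, y, rng)
print(np.bincount(y_res))  # both classes now have 100 samples
```

The resampled arrays keep their original image shape, so they can be wrapped directly in a `tf.data.Dataset` or a PyTorch `TensorDataset` for training.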