Oversampling in multi-class classification
In multi-class classification problems, there are more than two classes or labels to predict, and hence more than one class may be imbalanced. This adds some complexity to the problem. However, we can apply the same techniques to multi-class classification problems as well. The imbalanced-learn library supports multi-class classification in almost all of its methods. We can choose among various sampling strategies using the sampling_strategy parameter. For multi-class classification, we can pass certain fixed string values (called built-in strategies) to the sampling_strategy parameter in the SMOTE API. We can also pass a dictionary with the following:
- Keys as the class labels
- Values as the number of samples of that class
Here are the built-in strategies for sampling_strategy when the parameter is used as a string:
- The minority strategy resamples only the minority class.
- The not...