A very common issue in classification tasks is class imbalance: one class is heavily outnumbered by the other (the idea extends to the multi-class case as well). Strictly speaking, any dataset in which the ratio of the two classes is not 1:1 is imbalanced. A slight imbalance is often not much of a problem, but in some industries and problems we can encounter ratios of 100:1, 1000:1, or even worse.
In this recipe, we work with a credit card fraud example, in which the fraudulent class makes up only 0.17% of the entire sample. In such cases, gathering more data (especially of the fraudulent class) might simply not be feasible, so we need to resort to techniques that help us understand and avoid the accuracy paradox, in which a model scores a high accuracy simply by always predicting the majority class.
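To make the accuracy paradox concrete, here is a minimal sketch that uses synthetic labels rather than the actual fraud dataset. The sample size of 10,000 and the variable names are assumptions chosen for illustration; only the 0.17% fraud share comes from the text. The "model" ignores its input and always predicts the non-fraudulent class.

```python
# Synthetic labels (an assumption, not the real dataset):
# 10,000 transactions, 17 of them fraudulent, i.e. 0.17%.
n_total = 10_000
n_fraud = 17
y_true = [1] * n_fraud + [0] * (n_total - n_fraud)

# A naive "model" that always predicts the majority (non-fraud) class.
y_pred = [0] * n_total

# Accuracy: share of all predictions that match the true label.
correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / n_total

# Recall: share of the actual fraud cases that were detected.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_positives / n_fraud

print(f"accuracy: {accuracy:.4f}")  # accuracy: 0.9983
print(f"recall:   {recall:.4f}")    # recall:   0.0000
```

The accuracy of 99.83% looks excellent, yet the recall of 0 shows that not a single fraudulent transaction was caught. This is why accuracy alone is a misleading metric on heavily imbalanced data.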