Cases where your classes are neatly balanced are more of an exception than the rule. In most of the interesting problems we'll come across, the classes are extremely imbalanced. Luckily, a small fraction of online payments are fraudulent, just like a small fraction of the population catch rare diseases. Conversely, few contestants win the lottery and fewer of your acquaintances become your close friends. That's why we are usually interested in capturing those rare cases.
In this chapter, we will learn how to deal with imbalanced classes. We will start by giving different weights to our training samples to mitigate the class imbalance problem. Afterward, we will learn about other techniques, such as undersampling and oversampling. We will see the effect of these techniques in practice. We will also learn how to combine concepts such as ensemble learning...