Removing examples uniformly
There are two major ways of removing the majority class examples uniformly from the data. The first way is to remove the examples randomly, and the other way involves using clustering techniques. Let’s discuss both of these methods in detail.
Random UnderSampling
The first technique the king might think of is to pick Capulets randomly and remove them from the town. This is a naïve approach. It might work, and the king might be able to bring peace to the town. But the king might cause unforeseen damage by picking up some influential Capulets. However, it is an excellent place to start our discussion. This technique can be considered a close cousin of random oversampling. In Random UnderSampling (RUS), as the name suggests, we randomly extract observations from the majority class until the classes are balanced. This technique inevitably leads to data loss, might harm the underlying structure of the data, and thus performs poorly sometimes...