Class-Imbalanced Data
Consider the online shopping company scenario discussed at the beginning of the chapter. Imagine that one of the four shortlisted sellers is a very well-known company. In such a situation, this company is likely to receive far more orders than the other three sellers. If the online shopping company decided to divert all customers to this seller, it would still end up matching the preference of a large number of customers. This is a classic case of class imbalance, in which one class dominates the others in terms of the number of data points. Class imbalance is also seen in fraud detection, anti-money laundering, spam detection, cancer detection, and many other situations.
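To make the point concrete, here is a minimal sketch of the "send everyone to the well-known seller" rule. The seller names and order counts are hypothetical, chosen only for illustration; they are not data from the chapter.

```python
from collections import Counter

# Hypothetical order history for the four shortlisted sellers.
# The counts are made up for illustration only.
orders = (
    ["seller_A"] * 70   # the well-known company
    + ["seller_B"] * 12
    + ["seller_C"] * 10
    + ["seller_D"] * 8
)

counts = Counter(orders)
majority_class, majority_count = counts.most_common(1)[0]

# Share of each class in the data: one class clearly dominates.
class_share = {label: n / len(orders) for label, n in counts.items()}

# Accuracy of a rule that always predicts the majority class.
baseline_accuracy = majority_count / len(orders)

# Per-class recall under that rule: every customer who prefers the
# majority seller is matched, and no customer who prefers another seller is.
recall = {label: 1.0 if label == majority_class else 0.0 for label in counts}

print("Class share:", class_share)
print("Always-predict-majority accuracy:", baseline_accuracy)
print("Per-class recall:", recall)
```

Under these assumed counts, the rule looks accurate overall even though it never matches a customer who prefers one of the three smaller sellers, which is why overall accuracy alone can be misleading on imbalanced data.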
Before going into the details of how to deal with class imbalance, let's first see, in the following exercise, how it can pose a big problem in a marketing analyst's work.