Exercises
- Explore the various undersampling APIs available from the imbalanced-learn library at https://imbalanced-learn.org/stable/references/under_sampling.html.
- Explore the NearMiss undersampling technique, available through the imblearn.under_sampling.NearMiss API. Which class of methods does it belong to? Apply the NearMiss method to the dataset that we used in the chapter.
- Try all the undersampling methods discussed in this chapter on the us_crime dataset from UCI. You can find this dataset in the fetch_datasets API of the imbalanced-learn library. Find the undersampling method with the highest f1-score metric for LogisticRegression and XGBoost models.
- Can you identify an undersampling method of your own? (Hint: think about combining the various approaches to undersampling in new ways.)