Cost-Sensitive Learning using scikit-learn and XGBoost models
scikit-learn provides a class_weight hyperparameter for most models that adjusts the weights of the individual classes. The forms this parameter accepts vary across the learning algorithms in scikit-learn. However, the main idea is the same: the parameter specifies the weight to use for each class in the loss calculation. For example, for logistic regression it specifies the values of weight FP and weight FN mentioned previously.
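As a minimal sketch of this idea, the snippet below fits LogisticRegression twice on a synthetic imbalanced dataset, once with default weights and once with a class_weight dictionary that penalizes errors on the minority class more heavily. The dataset, the 1:10 weight ratio, and the variable names are illustrative assumptions, not values taken from the text.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score

# Synthetic imbalanced data (assumption): roughly 95% class 0, 5% class 1.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)

# Baseline: every class contributes equally to the loss.
plain = LogisticRegression(max_iter=1000).fit(X, y)

# Cost-sensitive: errors on class 1 count 10 times as much as errors on class 0.
# The 1:10 ratio is an arbitrary illustrative choice.
weighted = LogisticRegression(
    max_iter=1000, class_weight={0: 1.0, 1: 10.0}
).fit(X, y)

# Compare recall on the minority class (evaluated on the training data for brevity).
recall_plain = recall_score(y, plain.predict(X))
recall_weighted = recall_score(y, weighted.predict(X))
print(f"recall without weights: {recall_plain:.3f}")
print(f"recall with weights:    {recall_weighted:.3f}")
```

Raising the weight of the minority class typically trades some precision for higher recall on that class, so in practice the weights would be tuned against the metric that matters for the problem.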
As with the LogisticRegression class, DecisionTreeClassifier accepts this parameter: we could use DecisionTreeClassifier(class_weight='balanced') or DecisionTreeClassifier(class_weight={0: 0.5, 1: 0.5}).
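The sketch below illustrates both forms on a small synthetic dataset. It uses scikit-learn's compute_class_weight helper to show the weights that the 'balanced' setting resolves to, and then passes the same weights as an explicit dictionary. The data and variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils.class_weight import compute_class_weight

# Toy imbalanced data (assumption): 90 samples of class 0, 10 of class 1.
y = np.array([0] * 90 + [1] * 10)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) + y[:, None]  # shift class 1 so it is learnable

# 'balanced' resolves to n_samples / (n_classes * class_count) for each class.
classes = np.unique(y)
balanced = compute_class_weight(class_weight="balanced", classes=classes, y=y)
weights = dict(zip(classes.tolist(), balanced.tolist()))
print(weights)

# Let scikit-learn derive the weights, or pass the same weights explicitly.
tree_auto = DecisionTreeClassifier(
    class_weight="balanced", max_depth=3, random_state=0
).fit(X, y)
tree_manual = DecisionTreeClassifier(
    class_weight=weights, max_depth=3, random_state=0
).fit(X, y)

# With identical weights and random_state, both trees should predict the same labels.
same = np.array_equal(tree_auto.predict(X), tree_manual.predict(X))
```

Note that a dictionary with equal weights, such as {0: 0.5, 1: 0.5}, leaves the relative importance of the two classes unchanged; the weights only shift the model when they differ between classes.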
For SVM, the class_weight parameter also extends to multi-class classification by specifying a weight value for each class label:
svm.SVC(class_weight={-1: 1.0, 0: 1.0, 1: 1.0})
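A minimal multi-class sketch follows, using the same three labels (-1, 0, 1) but giving the rarest class a larger weight. In SVC, the weight of a class multiplies the regularization parameter C for that class. The synthetic clusters, class sizes, and the weight of 5.0 are assumptions chosen for illustration.

```python
import numpy as np
from sklearn import svm

rng = np.random.default_rng(0)

# Three Gaussian clusters labeled -1, 0, and 1; class 1 is deliberately rare.
centers = {-1: (-2.0, 0.0), 0: (0.0, 2.0), 1: (2.0, 0.0)}
counts = {-1: 100, 0: 100, 1: 15}
X = np.vstack(
    [rng.normal(loc=centers[c], scale=1.0, size=(counts[c], 2)) for c in centers]
)
y = np.concatenate([np.full(counts[c], c) for c in centers])

# One weight per class label; the rare class gets a larger (illustrative) weight.
clf = svm.SVC(class_weight={-1: 1.0, 0: 1.0, 1: 5.0}).fit(X, y)

# class_weight_ holds the per-class multipliers of C, ordered like clf.classes_.
print(dict(zip(clf.classes_.tolist(), clf.class_weight_.tolist())))

# Classify a point at the center of the rare class's cluster.
pred = clf.predict(np.array([[2.0, 0.0]]))
```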
The general guidance for choosing the class_weight values is to use the inverse of...