The fraud detection problem is not a supervised learning problem. Our fraud detection scenario has unbalanced classes. Why does the F1 score matter here, and how does it relate to the target variable? First, the target variable is a binary label. Second, the F1 score is relevant because the classes are unbalanced and one class matters far more in practice than the other. What do we mean by that? The bottom line of the fraud detection process is whether a given instance is fraudulent, and whether the classifier labels that instance correctly as fraudulent. The emphasis is not on labeling an instance as non-fraudulent.
To reiterate, there are two classes in our fraud detection problem:
- Fraudulent
- Non-fraudulent...
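
To make the role of the F1 score concrete, here is a minimal sketch in plain Python. The data is invented for illustration (100 transactions, 5 of them fraudulent), and the helper names `f1_score` and `accuracy` are hypothetical, not taken from any library used in this chapter. It compares a classifier that never flags fraud with one that catches 4 of the 5 fraudulent instances at the cost of 2 false alarms.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive (fraudulent) class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of instances labeled correctly, regardless of class."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Illustrative data only: 1 = fraudulent, 0 = non-fraudulent.
y_true = [1] * 5 + [0] * 95

# Classifier A labels everything as non-fraudulent.
pred_a = [0] * 100

# Classifier B catches 4 of 5 frauds and raises 2 false alarms.
pred_b = [1, 1, 1, 1, 0] + [1, 1] + [0] * 93

acc_a, f1_a = accuracy(y_true, pred_a), f1_score(y_true, pred_a)
acc_b, f1_b = accuracy(y_true, pred_b), f1_score(y_true, pred_b)

print(f"A: accuracy={acc_a:.2f}, F1={f1_a:.2f}")
print(f"B: accuracy={acc_b:.2f}, F1={f1_b:.2f}")
```

Under these assumed numbers, classifier A reaches 95% accuracy without finding a single fraudulent instance, and its F1 score is 0. Classifier B is only slightly more accurate (97%), but its F1 score of roughly 0.73 reflects that it actually identifies the class we care about.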