In supervised machine learning, if the target variable is a categorical variable, the model is called a classifier:
The target variable is called a label.
The historical data is called labeled data.
The production data, for which the label needs to be predicted, is called unlabeled data.
The ability to accurately label unlabeled data using a trained model is the real power of classification algorithms. Classifiers predict labels for unlabeled data to answer a particular business question.
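To make the terms concrete, the following is a minimal sketch of labeled data, unlabeled data, and a classifier that predicts labels. The feature values, the "yes"/"no" labels, and the helper names `squared_distance` and `predict_label` are all hypothetical, and the one-nearest-neighbor rule is only a stand-in chosen for brevity; it is not one of the algorithms compared later.

```python
# Labeled (historical) data: feature vectors paired with a known label.
# The features (age, income) and labels are made up for illustration.
labeled_data = [
    ((25, 30_000), "no"),
    ((32, 42_000), "no"),
    ((47, 95_000), "yes"),
    ((52, 110_000), "yes"),
]

# Unlabeled (production) data: same features, but the label is unknown.
unlabeled_data = [(29, 35_000), (50, 100_000)]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_label(features, training_rows):
    """Return the label of the closest labeled row (1-nearest neighbor)."""
    _, nearest_label = min(
        training_rows, key=lambda row: squared_distance(row[0], features)
    )
    return nearest_label


# The "classifier" assigns a label to every unlabeled row.
predicted_labels = [predict_label(row, labeled_data) for row in unlabeled_data]
print(predicted_labels)  # ['no', 'yes']
```

In this sketch the features are left unscaled, so income dominates the distance; a real classifier would normally preprocess the features first.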
Before we present the details of classification algorithms, let's first introduce a business problem that we will use as a challenge for classifiers. We will then apply six different algorithms to the same challenge, which will help us compare their methodology, approach, and performance.