As a supervised learning task, classification is the problem of identifying which set of observations (sample) belongs to what based on one or more independent variables. This learning process is based on a training set containing observations (or instances) about the class or label of membership. Typically, classification problems are when we are training a model to predict quantitative (but discrete) targets, such as spam detection, churn prediction, sentiment analysis, cancer type prediction, and so on.
Suppose we want to develop a predictive model, which will predict whether a student is competent enough to get admission into computer science based on his/her competency in TOEFL and GRE. Also, suppose we have some historical data in the following range/format:
- TOEFL: Between 0 and 100
- GRE: Between 0 and 100
- Admission: 1 for admitted, 0 if not admitted...