Introduction
Classification is used to identify a category of new observations (testing datasets) based on a classification model built from the training dataset of which the categories are already known. Similar to regression, classification is categorized as a supervised learning method as it employs known answers (label) of a training dataset to predict the answer (label) of the testing dataset. The main difference between regression and classification is that regression is used to predict continuous values.
In contrast to this, classification is used to identify the category of a given observation. For example, one may use regression to predict the future price of a given stock based on historical prices. However, one should use the classification method to predict whether the stock price will rise or fall.
In this chapter, we will illustrate how to use R to perform classification. We first build a training dataset and a testing dataset from the churn
dataset, and then apply different...