Classifying data with the k-nearest neighbor classifier
K-nearest neighbor (knn) is a nonparametric, lazy learning method. Being nonparametric, it makes no assumptions about the underlying data distribution. Being lazy, it requires no explicit training phase for generalization; instead, it defers the work to prediction time, labeling each new observation by a majority vote among its k closest training examples. The following recipe introduces how to apply the k-nearest neighbor algorithm to the churn dataset.
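As a quick, self-contained illustration of this behavior, the following minimal sketch uses the built-in iris data purely as a stand-in (it is not part of the churn recipe): knn() from the class package receives the raw training rows, the test rows, and the training labels at prediction time, and votes among the k = 3 nearest neighbors:
> library(class)
> set.seed(123)
> idx = sample(nrow(iris), 100)
> pred = knn(train = iris[idx, 1:4], test = iris[-idx, 1:4], cl = iris[idx, 5], k = 3)
> table(pred, iris[-idx, 5])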
Getting ready
You need to have completed the previous recipe, which generates the training dataset (trainset) and the testing dataset (testset).
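If trainset and testset are not already in your workspace, the following is a minimal sketch of one way to regenerate them. It assumes the churn data shipped with the C50 package (newer versions of the package may distribute this data elsewhere) and a simple 70/30 random split, which may differ from the exact procedure used in the previous recipe:
> library(C50)
> data(churn)
> set.seed(2)
> ind = sample(2, nrow(churnTrain), replace = TRUE, prob = c(0.7, 0.3))
> trainset = churnTrain[ind == 1, ]
> testset = churnTrain[ind == 2, ]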
How to do it...
Perform the following steps to classify the churn data with the k-nearest neighbor algorithm:
- First, one has to install the class package and have it loaded in an R session:
> install.packages("class")
> library(class)
- Replace yes and no of the international_plan and voice_mail_plan attributes in both the training dataset and the testing dataset with 1 and 0:
> levels(trainset$international_plan) = list("0"="no", "1"="yes")
> levels(trainset$voice_mail_plan) = list("0"="no", "1"="yes")
> levels(testset$international_plan) = list("0"="no", "1"="yes")
> levels(testset$voice_mail_plan) = list("0"="no", "1"="yes")
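Once the plan attributes are recoded as 0/1, the classifier itself can be applied. The following is a minimal sketch of such a call rather than the recipe's prescribed code; it assumes the class label is stored in a column named churn, that the remaining columns are numeric or 0/1-coded (non-numeric attributes such as state would need to be dropped first), and that k = 3 is an acceptable neighborhood size:
> churn.knn = knn(trainset[, !names(trainset) %in% c("churn")], testset[, !names(testset) %in% c("churn")], trainset$churn, k = 3)
> summary(churn.knn)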