Key concepts for K-nearest neighbors regression
Part of the appeal of the KNN algorithm is that it is quite straightforward and easy to interpret. For each observation where we need to predict the target, KNN finds the k training observations whose features are most similar to those of that observation. When the target is categorical, KNN predicts the most frequent target value among those k training observations. (We often select an odd value for k in classification problems to avoid ties.)
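A minimal sketch of that majority-vote step, using plain NumPy, may make it concrete. The features, labels, and new observation here are invented purely for illustration:

```python
import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, x_new, k=3):
    """Predict a class label for x_new by majority vote among
    the k training observations closest in Euclidean distance."""
    distances = np.linalg.norm(X_train - x_new, axis=1)  # distance to each training point
    nearest = np.argsort(distances)[:k]                  # indices of the k closest points
    votes = Counter(y_train[nearest])                    # tally labels among those neighbors
    return votes.most_common(1)[0][0]                    # most frequent label wins

# hypothetical two-feature training data
X_train = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0], [1.2, 0.9]])
y_train = np.array(["red", "red", "blue", "blue", "red"])

print(knn_classify(X_train, y_train, np.array([1.4, 1.5]), k=3))  # red
```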
When the target is numeric, KNN predicts the average target value of those k training observations. By training observation, I mean an observation whose target value is known. No real training is done with KNN, as it is what is called a lazy learner. I will discuss that in more detail later in this section.
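The regression case can be sketched the same way; the only change is that we average the neighbors' targets rather than taking a vote. Again, the data below is made up for illustration:

```python
import numpy as np

def knn_regress(X_train, y_train, x_new, k=3):
    """Predict a numeric target for x_new by averaging the targets
    of the k closest training observations."""
    distances = np.linalg.norm(X_train - x_new, axis=1)  # distance to each training point
    nearest = np.argsort(distances)[:k]                  # indices of the k nearest neighbors
    return y_train[nearest].mean()                       # average their known targets

# hypothetical data: one feature, numeric target
X_train = np.array([[1.0], [2.0], [3.0], [8.0], [9.0]])
y_train = np.array([10.0, 12.0, 11.0, 30.0, 32.0])

print(knn_regress(X_train, y_train, np.array([2.5]), k=3))  # 11.0
```

In practice you would typically use scikit-learn's KNeighborsRegressor (or KNeighborsClassifier), which implements the same idea with faster neighbor searches. Note that fitting such a model does little more than store or index the training data, which is exactly what makes KNN a lazy learner.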
Figure 9.1 illustrates K-nearest neighbors classification with values of 1 and 3 for k. When k is 1, our new observation will be assigned the red label. When k...