Key concepts of KNN
KNN might be the most intuitive algorithm that we will discuss in this book. The idea is to find the k instances whose attributes are most similar to those of the observation we want to predict, where that similarity is relevant to the target. That last clause is an important, though perhaps obvious, qualification. We care about similarity among those attributes associated with the target’s value.
For each observation where we need to predict the target, KNN finds the k training observations whose features are most similar to those of that observation. When the target is categorical, KNN selects the most frequent target value among those k training observations. (For classification problems, we often choose an odd value of k to avoid ties.)
By training observations, I mean those observations that have known target values. No real training is done with KNN since it is a lazy learner; I will discuss what that means in more detail later in this section.
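To make the procedure concrete, here is a minimal sketch of KNN classification written from scratch with NumPy. The function name knn_predict, the use of Euclidean distance, and the toy data are my own illustrative choices, not a fixed part of the algorithm; the essential steps are simply computing distances to every training observation, taking the k nearest, and returning the most frequent target value among them.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=5):
    """Classify a single observation by majority vote among its k nearest neighbors."""
    # Euclidean distance from the new observation to every training observation
    distances = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    # Indices of the k training observations with the smallest distances
    nearest = np.argsort(distances)[:k]
    # Most frequent target value among those k neighbors
    return Counter(y_train[nearest]).most_common(1)[0][0]

# A small illustrative example (hypothetical data)
X_train = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0], [1.2, 0.5]])
y_train = np.array(["a", "a", "b", "b", "a"])
print(knn_predict(X_train, y_train, np.array([1.1, 1.9]), k=3))  # prints "a"
```

Notice that all the work happens at prediction time: nothing is fit or summarized in advance, which is exactly what makes KNN a lazy learner.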
The following diagram illustrates the use of KNN for classification...