This method is intrinsically one of the simplest algorithms, belonging to the family of instance-based learning methods. Such an approach is not based on a parameterized model that must be fitted (for example, in order to maximize the likelihood); instead, instance-based algorithms rely entirely on the data and its underlying structure. In particular, k-NN is a technique that can be employed for different purposes (even if we are going to consider it as a clustering algorithm), and it's based on the idea that samples that are close according to a predefined distance metric are also likely to be similar, and can therefore share their distinctive features. More formally, let's consider a dataset containing M N-dimensional samples:
$$X = \{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_M\} \quad \text{where } \bar{x}_i \in \mathbb{R}^N$$
In order to measure the similarity, we need to introduce a distance function. The most common choice is the Minkowski metric, which is defined as follows:
$$d_p(\bar{x}_i, \bar{x}_j) = \left( \sum_{k=1}^{N} \left| x_i^{(k)} - x_j^{(k)} \right|^p \right)^{\frac{1}{p}} \quad \text{with } p \geq 1$$
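As a minimal sketch of this formula (the function name and signature are illustrative assumptions, not part of any specific library), the distance can be computed directly with NumPy:

```python
import numpy as np

def minkowski_distance(x1: np.ndarray, x2: np.ndarray, p: float = 2.0) -> float:
    # Illustrative helper: Minkowski distance between two N-dimensional samples
    if p < 1.0:
        raise ValueError("p must be >= 1 for the Minkowski metric")
    # Sum the absolute component-wise differences raised to the p-th power,
    # then take the p-th root
    return float(np.sum(np.abs(x1 - x2) ** p) ** (1.0 / p))
```

The same computation is available as `scipy.spatial.distance.minkowski(u, v, p)`, which can be used to cross-check the result.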
When p = 1, $d_1(\bullet)$ becomes the Manhattan (or city block) distance, which simply sums the absolute differences between corresponding components, while p = 2 corresponds to the classical Euclidean distance.
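To make the neighbor-query idea concrete, here is a minimal sketch (the dataset is randomly generated purely for illustration) using scikit-learn's `NearestNeighbors`, which performs exactly this kind of instance-based lookup with the Minkowski metric:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Illustrative dataset: M = 100 samples drawn uniformly from R^4
rng = np.random.RandomState(1000)
X = rng.uniform(-1.0, 1.0, size=(100, 4))

# Index the dataset and query the k = 5 nearest neighbors of the first sample,
# using the Minkowski metric with p = 2 (i.e., the Euclidean distance)
knn = NearestNeighbors(n_neighbors=5, metric='minkowski', p=2)
knn.fit(X)

distances, indices = knn.kneighbors(X[0].reshape(1, -1))
print(indices[0])    # indices of the 5 closest samples (the query point itself is included)
print(distances[0])  # corresponding distances, sorted in ascending order
```

Changing the parameter `p` switches the underlying metric (for example, `p=1` yields the Manhattan distance) without altering the query logic.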