Let's consider the following dataset:

$X = \{\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n\}, \quad \bar{x}_i \in \mathbb{R}^m$
We define the affinity as a metric function of two arguments with the same dimensionality, m. The most common metrics (also supported by scikit-learn) are the following (a quick numerical check follows the list):
- Euclidean or L2 (Minkowski distance with p=2):

  $d_2(\bar{x}_i, \bar{x}_j) = \sqrt{\sum_{k=1}^{m} \left(x_i^{(k)} - x_j^{(k)}\right)^2} = \left\|\bar{x}_i - \bar{x}_j\right\|_2$
- Manhattan (also known as city block) or L1 (Minkowski distance with p=1):

  $d_1(\bar{x}_i, \bar{x}_j) = \sum_{k=1}^{m} \left|x_i^{(k)} - x_j^{(k)}\right| = \left\|\bar{x}_i - \bar{x}_j\right\|_1$
- Cosine distance:

  $d_{\cos}(\bar{x}_i, \bar{x}_j) = 1 - \frac{\bar{x}_i \cdot \bar{x}_j}{\left\|\bar{x}_i\right\|_2 \left\|\bar{x}_j\right\|_2}$
The Euclidean distance is normally a good choice, but sometimes it's useful to have a metric that grows faster than the Euclidean one, so that the gap between the two widens as the points move apart. As discussed in Chapter 9, Clustering Fundamentals, the Manhattan metric has this property: for a point $(x, x)$ on the line $y = x$, the Euclidean distance from the origin is $\sqrt{2}\,x$, while the Manhattan distance is $2x$, so their difference $(2 - \sqrt{2})\,x$ grows linearly with $x$. In the following graph, there's a plot representing the distances from the origin of points belonging to the line $y = x$:

Figure: Distances of the point (x, x) from (0, 0) using the Euclidean and Manhattan metrics
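The numbers behind that plot are easy to reproduce; the following sketch computes both norms with NumPy (the range of x is arbitrary, chosen only for illustration):

```python
import numpy as np

# Points (x, x) on the line y = x
x = np.linspace(0.0, 10.0, 6)
points = np.column_stack((x, x))

d2 = np.linalg.norm(points, ord=2, axis=1)  # Euclidean: sqrt(2) * x
d1 = np.linalg.norm(points, ord=1, axis=1)  # Manhattan: 2 * x

# The gap d1 - d2 = (2 - sqrt(2)) * x grows linearly with x
for xi, e, m in zip(x, d2, d1):
    print(f'x = {xi:5.1f}   d2 = {e:7.3f}   d1 = {m:7.3f}   gap = {m - e:6.3f}')
```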
The cosine distance is instead useful when we need a measure that depends only on the angle between the two vectors, ignoring their magnitudes (two proportional vectors have zero cosine distance).