Let's consider the following dataset:
![](https://static.packt-cdn.com/products/9781789347999/graphics/assets/0a05b46c-bef8-46ad-8f03-079c387b7ddb.png)
We define an affinity as a metric function of two vectors with the same dimensionality, m. The most common metrics (also supported by scikit-learn) are the following:
- Euclidean or L2 (Minkowski distance with p=2):
![](https://static.packt-cdn.com/products/9781789347999/graphics/assets/5afba783-e264-40f4-95a0-839e89a7a63a.png)
- Manhattan (also known as city block) or L1 (Minkowski distance with p=1):
![](https://static.packt-cdn.com/products/9781789347999/graphics/assets/c3837005-fafa-4982-bc32-7f5393aa1e61.png)
- Cosine distance:
![](https://static.packt-cdn.com/products/9781789347999/graphics/assets/0ec1cabd-5231-4c43-b595-69a3bfefecff.png)
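The three metrics above can be compared directly with scikit-learn's `pairwise_distances` function. The sample points below are arbitrary, chosen only to illustrate how the same pair of vectors yields different distances under each metric:

```python
import numpy as np
from sklearn.metrics import pairwise_distances

# Two sample points with the same dimensionality, m = 3
X = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])

for metric in ("euclidean", "manhattan", "cosine"):
    # D is a symmetric matrix; D[i, j] is the distance between X[i] and X[j]
    D = pairwise_distances(X, metric=metric)
    print(metric, D[0, 1])
```

For this pair, the Euclidean distance is sqrt(3), the Manhattan distance is 3, and the cosine distance is 1 minus the cosine of the angle between the two vectors.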
The Euclidean distance is normally a good choice, but sometimes it's useful to employ a metric whose difference from the Euclidean distance grows as the points move farther apart. As discussed in Chapter 9, Clustering Fundamentals, the Manhattan metric has this property. In the following graph, there's a plot representing the distances from the origin of points belonging to the line y = x:
![](https://static.packt-cdn.com/products/9781789347999/graphics/assets/c17f9082-d708-4cd9-a030-2f23d45c7fac.png)
Distances of the point (x, x) from (0, 0) using the Euclidean and Manhattan metrics
The cosine distance is instead useful when we want the distance to depend only on the angle between the vectors, and not on their magnitudes.
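This scale-invariance can be checked with scikit-learn's `cosine_distances` (the sample vectors below are arbitrary illustrations):

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_distances

a = np.array([[1.0, 2.0, 3.0]])
b = np.array([[2.0, 4.0, 6.0]])   # same direction as a, twice the magnitude
c = np.array([[-3.0, 0.0, 1.0]])  # orthogonal to a (dot product is zero)

# Parallel vectors have cosine distance ~0, regardless of their lengths
print(cosine_distances(a, b))
print(cosine_distances(a, 10 * a))

# Orthogonal vectors have cosine distance 1
print(cosine_distances(a, c))
```

Rescaling a vector never changes its cosine distance to another vector, which is why this metric is popular for data where only the direction (for example, term frequencies in documents) carries information.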