Computing similarity scores
To build a recommendation system, we need a way to compare the objects in the dataset. If the dataset consists of people and their movie preferences, then making a recommendation requires comparing any two people with one another. This is where the similarity score comes in: it quantifies how similar two data points are.
Two scores are used frequently in this domain: the Euclidean score and the Pearson score. The Euclidean score is computed from the Euclidean distance between two data points. For a quick refresher on how Euclidean distance is computed, see:
https://en.wikipedia.org/wiki/Euclidean_distance
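As a rough illustration of the distance itself, here is a minimal sketch that compares two people over the movies they have both rated. The ratings dictionary, the names in it, and the helper euclidean_distance are all hypothetical and chosen only for this example; restricting the comparison to commonly rated movies is an assumption about how the data would be handled.

```python
import math

# Hypothetical ratings: each person maps movie titles to a numeric rating.
ratings = {
    "Alice": {"Inception": 5.0, "Vertigo": 3.0, "Fargo": 4.0},
    "Bob": {"Inception": 2.0, "Vertigo": 3.0, "Psycho": 5.0},
}


def euclidean_distance(data, person_a, person_b):
    """Euclidean distance over the movies both people have rated.

    Returns None when the two people share no rated movies
    (an assumption made for this sketch).
    """
    common = set(data[person_a]) & set(data[person_b])
    if not common:
        return None

    # Sum the squared rating differences, then take the square root.
    squared_diffs = [(data[person_a][m] - data[person_b][m]) ** 2 for m in common]
    return math.sqrt(sum(squared_diffs))


distance = euclidean_distance(ratings, "Alice", "Bob")
print(distance)  # 3.0
```

With these made-up ratings, only Inception and Vertigo are shared, so the distance is sqrt((5 - 2)^2 + (3 - 3)^2) = 3.0.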
The Euclidean distance has no upper bound. Hence, we convert it so that the Euclidean score ranges from 0 to 1. If the Euclidean distance between two objects is large...