Model-based collaborative filtering
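This is currently one of the most advanced approaches, and it extends what was seen in the previous section. The starting point is always a rating-based user-item matrix, which for $m$ users and $n$ items we can write as:
$$R = \begin{pmatrix} r_{11} & \cdots & r_{1n} \\ \vdots & \ddots & \vdots \\ r_{m1} & \cdots & r_{mn} \end{pmatrix}$$
However, in this case, we assume the presence of latent factors for both the users and the items. In other words, we define a generic user as a vector of $k$ latent factors:
$$p_i = (p_{i1}, p_{i2}, \ldots, p_{ik})$$
A generic item is defined in the same way:
$$q_j = (q_{j1}, q_{j2}, \ldots, q_{jk})$$
We don't know the value of each vector component (which is why they are called latent), but we assume that a rating is obtained as the dot product of the two vectors:
$$r_{ij} = p_i \cdot q_j = \sum_{l=1}^{k} p_{il} \, q_{jl}$$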
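As a minimal sketch of this assumption, the snippet below builds a full matrix of predicted ratings from user and item latent vectors by taking dot products. The names `P`, `Q`, and `R_hat`, the matrix sizes, and the random values are all illustrative assumptions; in a real model the latent factors would have to be learned from the known ratings.

```python
import numpy as np

# Hypothetical sizes: 4 users, 5 items, k = 3 latent factors.
n_users, n_items, k = 4, 5, 3

rng = np.random.default_rng(0)

# The latent factors are unknown in practice; random values stand in for them here.
P = rng.normal(size=(n_users, k))   # row i is the latent vector of user i
Q = rng.normal(size=(n_items, k))   # row j is the latent vector of item j

# Each predicted rating is a dot product: R_hat[i, j] = P[i] . Q[j]
R_hat = P @ Q.T

print(R_hat.shape)                        # (4, 5)
print(np.linalg.matrix_rank(R_hat) <= k)  # True: the product has rank at most k
```

Because every entry comes from two k-dimensional vectors, the resulting matrix can never have a rank higher than k, no matter how many users and items there are.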
So we can say that a rating is obtained from a latent space of rank k, where k is the number of latent variables we want to consider in our model. In general, there are no rules to determine the right value for k, so the best approach is to try different values and test the model on a held-out subset of known ratings. However, there's still a big problem to solve: finding the latent variables. There are several strategies, but before discussing them, it's important to understand...