An alternative to neighborhoods is to formulate recommendations as a regression problem and apply the methods that we learned in the Chapter 6, Clustering - Finding Related Posts.
We first consider why this problem is not a good fit for a classification formulation. We could certainly attempt to learn a five-class model, using one class for each possible movie rating. However, there are two problems with this approach:
- The different possible errors are not at all the same. For example, mistaking a 5-star movie for a 4-star one is not as serious a mistake as mistaking a 5-star movie for a 1-star one
- Intermediate values make sense. Even if our inputs are only integer values, it is perfectly meaningful to say that the prediction is 4.3. We can see that this is a different prediction than 3.5, even if they both round to 4
These two factors...