Improving the movie-rating system
We don't want to build a recommendation engine on a scoring system that ranks the likely straight-to-DVD Santa with Muscles above Casablanca. Improving the naïve scoring approach used previously is therefore the focus of this recipe.
Getting ready
Make sure that you have completed the previous recipes in this chapter first.
How to do it...
The following steps implement and test a new movie-scoring algorithm:
- Let's implement a new Bayesian movie-scoring algorithm, as shown in the following function, adding it to the MovieLens class:
In [11]: def bayesian_average(self, c=59, m=3):
    ...:     """
    ...:     Reports the Bayesian average with parameters c and m.
    ...:     """
    ...:     for movieid in self.movies:
    ...:         reviews = list(r['rating'] for r in self.reviews_for_movie(movieid))
    ...:         average = ((c * m) + sum(reviews)) / float(c + len(reviews))
    ...:         yield (movieid, average, len(reviews))
- Next, we will replace the top_rated method in the MovieLens class...
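To see why the Bayesian average from the first step helps, it can be tried outside the MovieLens class. The following is a minimal sketch, not the class method itself: the standalone function and the two rating lists are hypothetical and made up for illustration, while the defaults c=59 and m=3 are taken from the method above. Reading the formula, m acts as a prior rating and c as the number of imaginary reviews carrying that rating, so a movie with only a handful of reviews is pulled toward m.

```python
def bayesian_average(ratings, c=59, m=3):
    """
    Standalone version of the formula used in the recipe:
    (c * m + sum of ratings) / (c + number of ratings).
    """
    return ((c * m) + sum(ratings)) / float(c + len(ratings))


# Hypothetical data: an obscure movie with two perfect scores ...
few_reviews = [5, 5]
# ... and a well-known movie with 200 reviews averaging 4.6.
many_reviews = [5] * 120 + [4] * 80

# The naive mean ranks the obscure movie first.
naive_few = sum(few_reviews) / float(len(few_reviews))
naive_many = sum(many_reviews) / float(len(many_reviews))

# The Bayesian average pulls the obscure movie toward m=3,
# while the well-reviewed movie stays close to its own mean.
bayes_few = bayesian_average(few_reviews)
bayes_many = bayesian_average(many_reviews)

print("naive:    few=%.2f many=%.2f" % (naive_few, naive_many))
print("bayesian: few=%.2f many=%.2f" % (bayes_few, bayes_many))
```

With these made-up numbers, the ordering flips: the movie with two reviews leads under the naive mean but falls behind under the Bayesian average. A movie with no reviews at all simply receives the prior m.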