Building a movie recommender with Naïve Bayes
After the toy example, it is time to build a movie recommender (or, more specifically, a movie preference classifier) using a real dataset. We use a movie rating dataset (https://grouplens.org/datasets/movielens/) that was collected by the GroupLens Research group from the MovieLens website (http://movielens.org).
For demonstration purposes, we will use the stable small dataset, MovieLens 1M Dataset (the ml-1m.zip file, about 6 MB, which can be downloaded from https://files.grouplens.org/datasets/movielens/ml-1m.zip or https://grouplens.org/datasets/movielens/1m/). It has around 1 million ratings, ranging from 1 to 5 in whole-star increments, given by 6,040 users on 3,706 movies (released February 2003).
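If you would rather script the download than fetch the archive by hand, the following is a minimal sketch using only the Python standard library. The function name fetch_and_extract and the data target folder are illustrative assumptions, not part of the dataset or of any library; the URL is the one given above.

```python
import zipfile
from pathlib import Path
from urllib.request import urlretrieve

# Download URL quoted in the text above.
ML_1M_URL = "https://files.grouplens.org/datasets/movielens/ml-1m.zip"


def fetch_and_extract(url=ML_1M_URL, dest_dir="data"):
    """Download a zip archive (if not already present) and extract it.

    Hypothetical helper for illustration: `dest_dir` is an assumed
    location. Returns the sorted list of member names in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Name the local archive after the last URL path component.
    archive = dest / url.rsplit("/", 1)[-1]
    if not archive.exists():
        urlretrieve(url, archive)

    # Extract everything next to the archive and report what was inside.
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        return sorted(zf.namelist())
```

Calling fetch_and_extract() with its defaults downloads the archive into data and extracts it there, skipping the download on later runs if the zip file is already present. The returned list of member names is a quick way to check what the archive contained.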
Unzip the ml-1m.zip file and you will see the following four files:
movies.dat: It contains the movie information in the format of MovieID::Title::Genres.
ratings.dat: It...