Building a movie recommender with Naïve Bayes
After the toy example, it is now time to build a movie recommender (or, more specifically, a movie preference classifier) using a real dataset. Here we use a movie rating dataset collected by the GroupLens Research group from the MovieLens website.
For demonstration purposes, we will use the small dataset, ml-latest-small (size: 1 MB), as an example. It has around 100,000 ratings, ranging from 1 to 5, given by 6,040 users on 3,706 movies (last updated September 2018).
Unzip the downloaded file and you will see the following four files:
The movie information file: It contains the movie information in the format of MovieID::Title::Genres
ratings.dat: It contains user movie ratings in the format of UserID...
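To make the MovieID::Title::Genres layout concrete, here is a minimal sketch of how one such record could be split into its fields. The function name parse_movie_line and the sample records are hypothetical and not taken from the dataset. Treating MovieID as an integer and joining several genres with "|" are assumptions made for illustration only.

```python
def parse_movie_line(line):
    """Split one 'MovieID::Title::Genres' record into its three fields."""
    # Assumption: the title itself never contains the '::' delimiter.
    movie_id, title, genres = line.rstrip("\n").split("::")
    # Assumption: MovieID is an integer identifier.
    return int(movie_id), title, genres


# Hypothetical sample records, made up for illustration (not real dataset rows).
sample_lines = [
    "1::Example Movie (1999)::Comedy|Drama\n",
    "2::Another Film (2005)::Action\n",
]

# Build a lookup table from movie ID to (title, genres).
movies = {}
for line in sample_lines:
    movie_id, title, genres = parse_movie_line(line)
    movies[movie_id] = (title, genres)

print(movies[1])
```

In practice the same function could be applied line by line to the unzipped movie file. Keeping the result in a dictionary keyed by movie ID makes it easy to look up a title when inspecting ratings later.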