Download the code and data
In this chapter, we'll make use of data on film recommendations from the website https://movielens.org/. The site is run by GroupLens, a research lab in the Department of Computer Science and Engineering at the University of Minnesota, Twin Cities.
Datasets are available in several sizes at https://grouplens.org/datasets/movielens/. In this chapter, we'll use "MovieLens 100k", a collection of 100,000 ratings from about 1,000 users on about 1,700 movies. The data was released in 1998, so it's beginning to show its age, but it is small enough to demonstrate the principles of recommender systems conveniently. This chapter will give you the tools you need to process the more recently released "MovieLens 20M" data: 20 million ratings by 138,000 users on 27,000 movies.
Note
The code for this chapter is available from Packt Publishing's website or from https://github.com/clojuredatascience...