Recommendation system in Spark
We now move on to a practical example of building a recommendation system with Spark. Since most users are familiar with movies, we will use the MovieLens dataset: we will take a look at the data, build a recommendation system on it, and consider some of the available options. Together, the theory behind recommendation systems and this practical example should give you a good starting point for building your own.
Sample dataset
We are going to use the MovieLens 100k dataset, which at the time of writing was last updated in October 2016. This dataset (ml-latest-small) describes 5-star rating and free-text tagging activity from MovieLens (https://movielens.org/), a movie recommendation service. It contains 100,004 ratings and 1,296 tag applications across 9,125 movies. The data was created by 671 users between January 9, 1995 and October 16, 2016. The dataset was generated on October 17, 2016, and it can be found at http://bit.ly/24PV0hK. Further details...