Summary
We've covered a lot of ground in this chapter. Although the subject was principally recommender systems, we also discussed dimensionality reduction and introduced the Spark distributed computation framework.
We started by discussing the difference between content-based and collaborative filtering-based approaches to the problem of recommendation. Within the context of collaborative filtering, we discussed item-item recommenders and built a Slope One recommender. We also discussed user-user recommenders and used Mahout's implementations of a variety of similarity measures and evaluators to implement and test several user-based recommenders. The challenge of evaluation provided an opportunity to introduce the statistics of information retrieval.
We spent a lot of time in this chapter covering several types of dimensionality reduction. For example, we learned about the probabilistic methods offered by Bloom filters and MinHash, and the analytic methods...