References
While writing this chapter, I came across many useful and relevant references, which are listed here:
- The Goodbye MapReduce article from Mahout News (https://mahout.apache.org/)
- The Spark MLlib guide (https://spark.apache.org/docs/latest/mllib-guide.html)
- The Collaborative Filtering ALS paper (http://dl.acm.org/citation.cfm?id=1608614)
- A good presentation on decision trees (http://spark-summit.org/wp-content/uploads/2014/07/Scalable-Distributed-Decision-Trees-in-Spark-Made-Das-Sparks-Talwalkar.pdf)
- Recommendation hands-on exercise from Spark Summit 2014 (https://databricks-training.s3.amazonaws.com/movie-recommendation-with-mllib.html)
- The AMPLab ML pipelines page (https://amplab.cs.berkeley.edu/ml-pipelines/)
- The Databricks blog post on ML pipelines, a new high-level API for MLlib (https://databricks.com/blog/2015/01/07/ml-pipelines-a-new-high-level-api-for-mllib.html)
- The ML Pipeline design document (https://docs.google.com/document/d/1rVwXRjWKfIb-7PI6b86ipytwbUH7irSNLF1_6dLmh8o/)
- The ML examples in the Spark repository (https://github.com/apache/spark/tree/master/examples/src/main/scala/org/apache/spark/examples/ml)