Apache Mahout
Apache Mahout is a scalable machine learning library, developed as an open source project under the Apache Software Foundation. It supports algorithms for clustering, classification, and collaborative filtering on distributed platforms. The project welcomes contributions of new algorithms. A contributed algorithm is not necessarily distributed; some run only on a single machine.
Tip
Because Apache Mahout accepts single-machine algorithms, it is recommended that you study an algorithm's implementation before running it on Hadoop.
Apache Mahout has a few algorithms that are implemented as MapReduce jobs. These algorithms can be run on Hadoop to exploit the parallelism of a distributed cluster. Again, a word of caution: study the implementation of an algorithm before using it in your Hadoop deployments. A non-MapReduce algorithm may not yield any speedup when run on a Hadoop cluster.
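To make the distinction above concrete, the following is a minimal sketch in plain Python of one k-means iteration written in a map/reduce shape. It is an illustration only and does not use Mahout's or Hadoop's API; the function names (`map_phase`, `reduce_phase`, `kmeans_step`) and the sample data are hypothetical. The point it tries to show is that each input point is processed independently in the map phase, so the input can be split into partitions without changing the result, which is the property a cluster can exploit.

```python
def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_phase(points, centroids):
    """Map: every point independently emits (centroid index, (point, 1))."""
    return [(nearest(p, centroids), (p, 1)) for p in points]


def reduce_phase(pairs, centroids):
    """Reduce: sum points and counts per centroid index, then average."""
    totals = {}
    for idx, (point, count) in pairs:
        acc, n = totals.get(idx, ([0.0] * len(point), 0))
        totals[idx] = ([a + x for a, x in zip(acc, point)], n + count)
    # A centroid that received no points keeps its previous position.
    return [
        [s / totals[i][1] for s in totals[i][0]] if i in totals else list(centroids[i])
        for i in range(len(centroids))
    ]


def kmeans_step(partitions, centroids):
    """Run one iteration; on a cluster each partition could go to a separate mapper."""
    pairs = []
    for part in partitions:
        pairs.extend(map_phase(part, centroids))
    return reduce_phase(pairs, centroids)


# Hypothetical sample data for illustration.
points = [(1.0, 1.0), (1.0, 3.0), (9.0, 9.0), (11.0, 9.0)]
centroids = [(0.0, 0.0), (10.0, 10.0)]

single = kmeans_step([points], centroids)                   # one partition
split = kmeans_step([points[:2], points[2:]], centroids)    # two partitions
print(single, split == single)
```

An algorithm whose steps depend on shared, sequentially updated state cannot be partitioned this way, which is one reason a single-machine implementation may gain nothing from being launched on a Hadoop cluster.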
Tip
In a recent change, since April 2014, Mahout has stopped accepting...