Introduction
H2O is a fast, scalable, open-source machine learning and deep learning library for smarter applications. Using in-memory compression, H2O handles billions of data rows in memory, even with a small cluster. To support complete analytic workflows, H2O's platform includes interfaces for R, Python, Scala, Java, JSON and CoffeeScript/JavaScript, as well as a built-in web interface, Flow. H2O is designed to run in standalone mode, on Hadoop, or within a Spark cluster. It includes many common machine learning algorithms, such as generalized linear modeling (linear regression, logistic regression, and so on), Naive Bayes, principal components analysis, k-means clustering and others.
H2O also implements best-in-class algorithms at scale, such as distributed random forest, gradient boosting and deep learning. Users can build thousands of models and compare the results to get the best predictions.
Sparkling Water allows users to combine the fast, scalable machine learning algorithms...