Naive Bayes with H2O on Hadoop with R
The growing number of machine learning applications in data science has led to the development of several Big Data predictive analytics tools, as described in the first part of this chapter. Even more exciting for R users, some of these tools connect well with the R language, allowing data analysts to use R to deploy and evaluate machine learning algorithms on massive datasets. One such Big Data machine learning platform is H2O: open-source, highly scalable, and fast data exploration and machine learning software developed and maintained by the California-based start-up H2O.ai (formerly known as 0xdata). As H2O is designed to integrate easily with cloud computing platforms such as Amazon EC2 or Microsoft Azure, it has become a popular choice for large businesses and organisations wanting to implement powerful machine and statistical learning models on massively scalable in-house or cloud-based infrastructures.