Section 2 – Building State-of-the-Art Models on Large Data Volumes Using H2O
This section dives deep into advanced techniques for building accurate and trusted ML models on large to massive data volumes using H2O. We first give an overview of the full capability set of H2O-3 and Sparkling Water for model building. We then demonstrate these capabilities by engineering features, building and optimizing supervised learning models, building H2O models embedded in Spark pipelines, building unsupervised models using H2O algorithms, and reviewing how to update H2O models and ensure their reproducibility. Next, we introduce in depth a number of methods for interpreting and understanding your model's decision-making process, and we introduce auto-documentation within H2O. Finally, we work through an extensive model-building exercise, from problem statement and raw data through data cleaning, feature engineering, model building and optimization, and candidate model selection based...