Chapter 6: Advanced Model Building – Part II
In Chapter 5, Advanced Model Building – Part I, we detailed the process for building an enterprise-grade supervised learning model on the H2O platform. In this chapter, we round out our advanced model-building topics by doing the following:
- Demonstrating how to build H2O supervised learning models within an Apache Spark pipeline
- Introducing H2O's unsupervised learning methods
- Discussing best practices for updating H2O models
- Documenting requirements to ensure H2O model reproducibility
We begin this chapter by introducing Sparkling Water pipelines, a method for embedding H2O models natively within a Spark pipeline. In enterprise settings where Spark is heavily used, we have found this to be a popular way to build and deploy H2O models. We demonstrate the approach by building a Sparkling Water pipeline for sentiment analysis using data from online reviews of Amazon food...