Local Sparkling Water cluster
Running Sparkling Water locally is similar to running H2O-3 locally, except that Spark must also be installed. For a full explanation of the Spark, Python, and H2O components involved, see https://docs.h2o.ai/sparkling-water/3.2/latest-stable/doc/pysparkling.html.
We will be using Spark 3.2 here. To use a different version of Spark, go to the Sparkling Water section of the H2O downloads page: https://h2o.ai/resources/download/.
For your Sparkling Water Python client, you must use Python 2.7.x, 3.5.x, 3.6.x, or 3.7.x. We will be running Sparkling Water from a Jupyter notebook here.
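Before going further, it can help to confirm that the interpreter behind your notebook is one of the versions listed above. The following is a minimal sketch, not part of Sparkling Water itself: the helper name is_supported_python and the SUPPORTED_SERIES set are our own, hypothetical names, and the version list is simply copied from the preceding paragraph.

```python
import sys

# Python series listed in the text above for the Sparkling Water Python client.
# This set is an assumption taken from that paragraph, not queried from any library.
SUPPORTED_SERIES = {(2, 7), (3, 5), (3, 6), (3, 7)}


def is_supported_python(version_info=sys.version_info):
    """Return True if the (major, minor) pair is in the listed series.

    Hypothetical helper for illustration; accepts any tuple-like
    (major, minor, ...) value so that it can be checked without
    switching interpreters.
    """
    return (version_info[0], version_info[1]) in SUPPORTED_SERIES


if __name__ == "__main__":
    current = "{}.{}.{}".format(*sys.version_info[:3])
    if is_supported_python():
        print("Python " + current + " is in the listed set of versions.")
    else:
        print("Python " + current + " is not in the listed set; "
              "switch to one of the versions named above.")
```

Running this in the first cell of the notebook reports whether the kernel's Python version matches the requirement, which is easier to diagnose here than after Spark has been installed.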
Step 1 – Install Spark locally
Follow these steps to install Spark locally:
- Go to https://spark.apache.org/downloads.html and make the following choices before downloading Spark:
  - Spark version: 3.2.x
  - Package type: Pre-built for Hadoop 3.3 and later
- Extract the downloaded archive (a .tgz file).
- Set the following environment variables...