Automated provisioning of a Spark cluster for development
In this section, you will learn how the platform enables your team to provision an Apache Spark cluster on demand. This capability allows your organization to run multiple isolated projects, owned by different teams, on a shared Kubernetes cluster without interference.
The heart of this component is the Spark operator included in the platform. The Spark Kubernetes operator allows you to start a Spark cluster declaratively. You can find the necessary configuration files in the book's Git repository under the manifests/radanalyticsio folder. The internal details of this operator are out of scope for this book, but we will show you how the mechanism works.
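As a quick orientation, the sketch below shows how those manifests might be applied in one step. It assumes your current kubectl context already points at the target Kubernetes cluster and that the folder contains the operator's deployment, RBAC, and CRD manifests; adjust the path to wherever you cloned the repository.

```bash
# Apply every manifest in the folder (operator deployment, RBAC, CRD).
# Assumes the current kubectl context targets your development cluster.
kubectl apply -f manifests/radanalyticsio/
```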
The Spark operator defines a Kubernetes custom resource definition (CRD), which provides the schema of the requests that you can make to the Spark operator. In this schema, you can define many things, such as the...
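Whatever fields the schema exposes, a request ultimately takes the form of a custom resource that you submit to the cluster. The following is a minimal sketch of such a request; it assumes the SparkCluster kind exposed by the radanalyticsio operator, and the cluster name and instance counts are purely illustrative — the exact fields available depend on the CRD version installed on your cluster.

```yaml
# A minimal SparkCluster request; field names assume the
# radanalyticsio operator's CRD and may differ in your version.
apiVersion: radanalytics.io/v1
kind: SparkCluster
metadata:
  name: my-dev-cluster    # illustrative name for a team's dev cluster
spec:
  master:
    instances: "1"        # one Spark master pod
  worker:
    instances: "2"        # two Spark worker pods
```

Once you apply this resource (for example, with kubectl apply -f), the operator reconciles the request: it creates the Spark master and worker pods and keeps them at the declared counts. Deleting the resource tears the cluster down again, which is what makes isolated, on-demand clusters per team practical on a shared Kubernetes cluster.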