Dataproc cluster instances are built on Google Compute Engine instances, which means we have a wide variety of machines to choose from to match our workload and budget. Like Compute Engine instances, Dataproc instances can use both predefined and custom machine types. In the beta update of Dataproc, we can also use the f1-micro machine type to reduce cost even further, whereas for performance-heavy applications we can choose SSD persistent disks over standard persistent disks. Apart from these basic configurations, Dataproc instances support the following optional customizations:
- GPUs
- Automatic zone selection
- Optional preemptibility for lower cost
- High availability
- Scheduled deletion of clusters
- Live scaling (without bringing apps down)
- Single-node sandbox clusters
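
To make these options more concrete, here is a minimal sketch of how several of them map onto `gcloud dataproc clusters create` flags. The helper names `build_create_command` and `build_scale_command`, the parameter names, and the default machine type are hypothetical and chosen only for illustration; the script only assembles the command line and does not call Google Cloud. Flag availability may differ between gcloud release tracks, so treat this as a starting point rather than a definitive recipe.

```python
import shlex


def build_create_command(
    name,
    region,
    worker_machine_type="n1-standard-4",  # assumed default, adjust as needed
    num_workers=2,
    num_secondary_workers=0,
    ssd_boot_disk=False,
    high_availability=False,
    single_node=False,
    max_idle=None,
    worker_gpu=None,
):
    """Return the argument list for `gcloud dataproc clusters create`."""
    if single_node and (high_availability or num_secondary_workers or worker_gpu):
        raise ValueError("a single-node cluster has no workers and one master")

    # No --zone flag is passed, leaving zone selection within the region
    # to Dataproc (automatic zone selection).
    args = ["gcloud", "dataproc", "clusters", "create", name, f"--region={region}"]

    if single_node:
        # Single-node sandbox cluster: one VM acts as master and worker.
        args.append("--single-node")
    else:
        args.append(f"--worker-machine-type={worker_machine_type}")
        args.append(f"--num-workers={num_workers}")
        if num_secondary_workers:
            # Secondary workers are the lower-cost, preemptible-style workers.
            args.append(f"--num-secondary-workers={num_secondary_workers}")
        if high_availability:
            # High availability mode runs three master nodes.
            args.append("--num-masters=3")
        if worker_gpu:
            gpu_type, count = worker_gpu
            args.append(f"--worker-accelerator=type={gpu_type},count={count}")

    if ssd_boot_disk:
        args.append("--master-boot-disk-type=pd-ssd")
        if not single_node:
            args.append("--worker-boot-disk-type=pd-ssd")

    if max_idle:
        # Scheduled deletion: delete the cluster after this idle period.
        args.append(f"--max-idle={max_idle}")

    return args


def build_scale_command(name, region, num_workers):
    """Return the argument list that resizes a running cluster."""
    return [
        "gcloud", "dataproc", "clusters", "update", name,
        f"--region={region}", f"--num-workers={num_workers}",
    ]


if __name__ == "__main__":
    create = build_create_command(
        "demo-cluster", "us-central1",
        num_secondary_workers=2, high_availability=True, max_idle="30m",
    )
    print(shlex.join(create))
    print(shlex.join(build_scale_command("demo-cluster", "us-central1", 5)))
```

Building the command as a list rather than a single string keeps each flag easy to test and avoids quoting mistakes; `shlex.join` is used only to display the result.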