Hadoop (and Spark) jobs constitute perhaps the single-most important use case for organizations moving to the cloud. So, getting your Hadoop strategy right is really important, and here, Dataproc is a fairly obvious way to get started. One important bit of cost optimization that you ought to perform is using as many pre-emptible instances as possible.
Recall that pre-emptible instances are those that can be taken back by the platform at very short notice. So, if you are using a pre-emptible VM instance, you could have it snatched away at any point with just about 30 seconds to execute a shutdown script and clean up your state.
The flip side of this inconvenience is that pre-emptible VM instances are very cheap. On an apples-to-apples basis, you can expect a pre-emptible instance to cost about 60-80% less than a non-pre-emptible...