Summary
Google Cloud's data processing services address a wide range of problems. We started by looking at messaging, using Pub/Sub and Pub/Sub Lite to integrate various Google Cloud products. We then learned about Dataproc, which lets us quickly run fully managed clusters for Apache Hadoop, Apache Spark, or Apache Pig workloads and process massive amounts of data. Finally, we covered Dataflow, a fully managed service based on Apache Beam that allows us to develop and run data processing pipelines in a simple, fast, and scalable way.
At the end of this chapter, we discovered another way to interact with Google Cloud services: REST APIs. We learned how to authenticate with Google Cloud using OAuth 2.0 and how to perform simple HTTP REST API calls. We then applied that knowledge to create a Pub/Sub topic and attach subscriptions to it.
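As a brief recap of that REST workflow, the following is a minimal sketch rather than the exact code used in the chapter. It only builds the two HTTP requests and sends nothing. The helper names (`build_put`, `create_topic_request`, `create_subscription_request`) and the project, topic, and subscription IDs are hypothetical, and we assume an OAuth 2.0 access token has already been obtained.

```python
import json
import urllib.request

# Base URL of the Pub/Sub REST API (v1).
PUBSUB_API = "https://pubsub.googleapis.com/v1"


def build_put(url, access_token, body):
    """Build (but do not send) an authenticated JSON PUT request."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            # The OAuth 2.0 access token is passed as a bearer token.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


def create_topic_request(project, topic, access_token):
    """Request that creates a topic; an empty JSON body is enough."""
    url = f"{PUBSUB_API}/projects/{project}/topics/{topic}"
    return build_put(url, access_token, {})


def create_subscription_request(project, subscription, topic, access_token):
    """Request that creates a subscription attached to an existing topic."""
    url = f"{PUBSUB_API}/projects/{project}/subscriptions/{subscription}"
    # The subscription body names the topic by its full resource path.
    body = {"topic": f"projects/{project}/topics/{topic}"}
    return build_put(url, access_token, body)
```

To actually send a request, we could pass the returned object to `urllib.request.urlopen`, using a real token such as the one printed by `gcloud auth print-access-token`.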
The next chapter will discuss Google Cloud’s operations suite (formerly Stackdriver), which provides a set of fully managed services for monitoring, logging,...