In this chapter, we have covered the main aspects of big data relating to the exam. We have covered each service and shown how each one can be used at different stages of our end-to-end solution. We took the time to see how we can configure Pub/Sub, Dataflow, and BigQuery from the GCP console, and we discussed Dataproc and Cloud IoT Core.
We looked at the processing stage of our solution. Cloud Dataflow provisions Google Compute Engine instances to execute our Apache Beam pipeline, which processes data from Pub/Sub and passes it on to further stages for analysis or storage. We have shown how easily we can create a pipeline in the GCP console that pulls information from Pub/Sub for analysis in BigQuery.
We covered BigQuery and now understand that...