Chapter 6: Processing Streaming Data with Pub/Sub and Dataflow
Processing streaming data is becoming increasingly popular, as streaming enables businesses to get real-time metrics on business operations. This chapter describes which paradigm should be used—and when—for streaming data. The chapter will also cover how to apply transformations to streaming data using Cloud Dataflow, and how to store processed records in BigQuery for analysis.
Streaming data is easier to learn through practice, so we will build a streaming data pipeline on Google Cloud Platform (GCP) using two GCP services, Pub/Sub and Dataflow. Both services are essential to creating a streaming data pipeline. We will use the same dataset that we used to practice building a batch data pipeline, so you can compare how the two approaches are similar and how they differ.
In summary, here are the topics that we will discuss in this chapter:
- Processing streaming data ...