Exercise – Datastream ETL streaming to BigQuery
In this exercise, we will set up a streaming pipeline using Datastream. We will create and configure several GCP components to move and transform data from a Cloud SQL for MySQL table to a BigQuery dataset.
This exercise is configuration-heavy and uses many different GCP components. I suggest that you open six browser tabs, one for each component. Use the following diagram as a checklist to make sure that every component is configured correctly:
Figure 6.24 – Datastream end-to-end steps
Most of the steps, other than those for Datastream and Dataflow, were covered in this chapter or in previous ones, so I will not go through every step in detail. This is a good chance to review what we’ve learned so far. Let’s start with the first step.