Technical requirements
Before we begin the chapter, make sure you have the following prerequisites ready.
In this chapter's exercises, we will use these GCP services: Dataproc, GCS, BigQuery, and Cloud Composer. If you have never opened any of these services in your GCP console, open each one and enable its API.
Make sure you have your GCP console, Cloud Shell, and Cloud Shell Editor ready.
Download the example code and the dataset here: https://github.com/PacktPublishing/Data-Engineering-with-Google-Cloud-Platform/tree/main/chapter-5.
Be aware of the costs that Dataproc clusters and the Cloud Composer environment might incur. Make sure you delete all the clusters and environments after the exercises to prevent any unexpected costs.
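If you prefer the command line to the console, the APIs for the services listed above can also be enabled from Cloud Shell. The following is a minimal sketch, not part of the book's example code. It assumes that the active gcloud project is the one you will use for the exercises and that the four service names in the SERVICES variable (a name chosen here for illustration) are the ones your project needs. Check the names against the API Library in your console before running it.

```shell
# Assumed API service names for Dataproc, GCS, BigQuery, and Cloud Composer.
# Verify them in the console's API Library; they are not taken from the book.
SERVICES="dataproc.googleapis.com storage.googleapis.com bigquery.googleapis.com composer.googleapis.com"

if command -v gcloud >/dev/null 2>&1; then
  # Enables every listed API in the currently active project.
  # SERVICES is left unquoted on purpose so that each name is a separate argument.
  gcloud services enable $SERVICES
else
  echo "gcloud not found; run this from Cloud Shell"
fi
```

Afterward, running gcloud services list --enabled shows which APIs are active in the project.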