Technical requirements
In this chapter, we will use the New York City Police Department (NYPD) Motor Vehicle Collisions – Crashes dataset (https://data.cityofnewyork.us/Public-Safety/Motor-Vehicle-Collisions-Crashes/h9gi-nx95), provided by the New York Open Data Catalog: https://opendata.cityofnewyork.us.
You may wish to load this data into BigQuery so that you can work through the cleansing and transformation examples in this chapter; working hands-on with the content will help reinforce the concepts. For loading instructions, refer to the hands-on data loading exercise in Chapter 4, Loading and Transforming Data. The raw data is available at the preceding link and in this book’s GitHub repository: https://github.com/PacktPublishing/Data-Exploration-and-Preparation-with-BigQuery/releases. You can download the dataset locally, upload it to a Google Cloud Storage (GCS) bucket, and then create a new table in BigQuery using the file in the GCS bucket as the source.
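If you prefer to script the upload and load steps rather than use the console, the following is a minimal sketch using the google-cloud-storage and google-cloud-bigquery Python client libraries. It is not the procedure from Chapter 4. It assumes that both libraries are installed, that your environment is already authenticated, that the dataset was downloaded as a CSV file with a header row, and that schema auto-detection is acceptable for a first pass. The bucket name, object path, local filename, and table ID shown in the usage comment are hypothetical placeholders.

```python
# Sketch: upload a local CSV to GCS, then load it into a BigQuery table.
# All bucket, object, and table names used with these helpers are placeholders.


def gcs_uri(bucket_name: str, object_name: str) -> str:
    """Build the gs:// URI that BigQuery uses to locate a file in GCS."""
    return f"gs://{bucket_name}/{object_name.lstrip('/')}"


def upload_to_gcs(local_path: str, bucket_name: str, object_name: str) -> str:
    """Upload a local file to an existing GCS bucket and return its URI."""
    # Imported here so the URI helper works without the client installed.
    from google.cloud import storage

    client = storage.Client()
    blob = client.bucket(bucket_name).blob(object_name)
    blob.upload_from_filename(local_path)
    return gcs_uri(bucket_name, object_name)


def load_csv_into_bigquery(uri: str, table_id: str) -> int:
    """Load a CSV file from GCS into a BigQuery table; return its row count."""
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # assumes the first row holds column names
        autodetect=True,      # let BigQuery infer the schema
    )
    load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
    load_job.result()  # block until the load job finishes
    return client.get_table(table_id).num_rows


# Example usage (hypothetical names; the dataset must already exist):
#   uri = upload_to_gcs("collisions.csv", "my-bucket", "raw/collisions.csv")
#   rows = load_csv_into_bigquery(uri, "my-project.my_dataset.collisions_raw")
```

Auto-detected column types are only a starting point; it is worth reviewing them before working through the cleansing examples, since date and numeric columns in raw exports are sometimes inferred as strings.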