Exploring and understanding the dataset
As we learned in Chapter 4, Predicting Numerical Values with Linear Regression, we need to analyze the data available for our use case before diving into the ML implementation. We begin by building a clear understanding of the data that can be used for our business scenario.
Understanding the data
To start exploring the data, we need to do the following:
- Log in to Google Cloud Console and access the BigQuery user interface from the navigation menu.
- Create a new dataset in the project that we created in Chapter 2, Setting Up Your GCP and BigQuery Environment. For this use case, we'll create the 05_chicago_taxi dataset with the default options.
- Open the bigquery-public-data GCP project that hosts all the BigQuery public datasets and browse the items until you find the chicago_taxi_trips dataset. In this public dataset, we can see only one BigQuery table: taxi_trips. This table contains all the information...
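The same steps can also be approximated with SQL instead of clicks in the user interface. The following is a minimal sketch, not the chapter's own method: it only builds the statements as strings so that they can be pasted into the BigQuery query editor. The helper function names are hypothetical, and the use of CREATE SCHEMA and INFORMATION_SCHEMA.COLUMNS is our assumption about a convenient way to create the dataset and inspect the table's columns; the project, dataset, and table names come from the steps above.

```python
# Names taken from the steps above.
PUBLIC_PROJECT = "bigquery-public-data"
PUBLIC_DATASET = "chicago_taxi_trips"
PUBLIC_TABLE = "taxi_trips"
WORK_DATASET = "05_chicago_taxi"


def create_dataset_sql(dataset: str = WORK_DATASET) -> str:
    """Build DDL that creates the working dataset with default options.

    Backticks are required because the dataset name starts with a digit.
    (Hypothetical helper; assumes the statement runs in our own project.)
    """
    return f"CREATE SCHEMA IF NOT EXISTS `{dataset}`;"


def list_columns_sql(
    project: str = PUBLIC_PROJECT,
    dataset: str = PUBLIC_DATASET,
    table: str = PUBLIC_TABLE,
) -> str:
    """Build a query that lists the column names and types of a table."""
    return (
        "SELECT column_name, data_type\n"
        f"FROM `{project}.{dataset}.INFORMATION_SCHEMA.COLUMNS`\n"
        f"WHERE table_name = '{table}'\n"
        "ORDER BY ordinal_position;"
    )


def row_count_sql(
    project: str = PUBLIC_PROJECT,
    dataset: str = PUBLIC_DATASET,
    table: str = PUBLIC_TABLE,
) -> str:
    """Build a query that counts the rows of a table."""
    return (
        "SELECT COUNT(*) AS total_rows\n"
        f"FROM `{project}.{dataset}.{table}`;"
    )


if __name__ == "__main__":
    # Print each statement so it can be copied into the query editor.
    for statement in (create_dataset_sql(), list_columns_sql(), row_count_sql()):
        print(statement)
        print()
```

Running the script prints three statements: one that creates the 05_chicago_taxi dataset, one that lists the columns of taxi_trips, and one that counts its rows. Browsing the table in the BigQuery user interface, as described above, gives the same information without writing any SQL.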