Using the K-Means clustering model
In this section, we'll learn how to use our K-Means clustering model on new data.
To use our BigQuery ML model, we'll apply the ML.PREDICT function to the same table that we created to train the machine learning model.
In this case, we'll also include the taxi_id column, which identifies each taxi driver. The following query assigns each taxi_id to its nearest cluster, according to the values of the speed_mph and tot_income fields:
SELECT
  * EXCEPT(nearest_centroids_distance)
FROM
  ML.PREDICT(
    MODEL `07_chicago_taxi_drivers.clustering_by_speed_and_income`,
    (
      SELECT *
      FROM `07_chicago_taxi_drivers.taxi_speed_and_income`
    )
  );
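Conceptually, the assignment that ML.PREDICT performs for a K-Means model amounts to finding the closest centroid for each input row. The following Python snippet is only a minimal sketch of that idea, not BigQuery ML's actual implementation: the centroid coordinates, the sample drivers, and the nearest_centroid helper are all hypothetical, and the distance is computed on raw feature values, whereas BigQuery ML may standardize the features before measuring distances.

```python
import math

# Hypothetical centroids: centroid_id -> (speed_mph, tot_income).
# Illustrative values only; they are NOT taken from the trained model.
CENTROIDS = {
    1: (15.0, 20000.0),
    2: (25.0, 60000.0),
    3: (35.0, 120000.0),
}


def nearest_centroid(speed_mph, tot_income, centroids=CENTROIDS):
    """Return (centroid_id, distance) for the closest centroid (Euclidean distance)."""
    best_id, best_dist = None, math.inf
    for centroid_id, (c_speed, c_income) in centroids.items():
        dist = math.hypot(speed_mph - c_speed, tot_income - c_income)
        if dist < best_dist:
            best_id, best_dist = centroid_id, dist
    return best_id, best_dist


# Hypothetical input rows, shaped like the table passed to ML.PREDICT.
drivers = [
    {"taxi_id": "taxi_a", "speed_mph": 14.0, "tot_income": 21000.0},
    {"taxi_id": "taxi_b", "speed_mph": 33.0, "tot_income": 115000.0},
]

# Map each taxi_id to the identifier of its nearest centroid.
assignments = {
    d["taxi_id"]: nearest_centroid(d["speed_mph"], d["tot_income"])[0]
    for d in drivers
}
print(assignments)
```

In this sketch, the identifier returned for each driver plays the role of the cluster assignment, and the computed distance corresponds loosely to the information that the query excludes with EXCEPT(nearest_centroids_distance).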
The query statement is composed of a SELECT keyword that extracts all the columns returned...