Understanding feature engineering
In this section, we'll look at the techniques we can use to improve the features of a BigQuery ML model before the training stage.
Important note
Feature engineering is the practice of applying preprocessing functions to raw data in order to extract features that are useful for training an ML model. Creating preprocessed features can significantly improve the performance of an ML model.
By design, BigQuery ML automatically applies feature engineering during the training phase when we use the CREATE MODEL statement, but it also allows us to apply our own preprocessing transformations.
To have the same feature engineering operations applied automatically during both the training and the prediction stages, we can include all the preprocessing functions in the TRANSFORM clause when we train the BigQuery ML model.
As we can see from the following code example, we need to use the TRANSFORM clause before the OPTIONS clause, and after the CREATE MODEL statement...
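As a minimal sketch of this clause ordering, the statement below places TRANSFORM between CREATE MODEL and OPTIONS. The dataset, table, model, and column names (`mydataset`, `taxi_trips`, `fare_model`, `trip_distance`, and so on) are hypothetical and chosen only for illustration; the preprocessing functions shown are just one possible selection.

```sql
-- Hypothetical example: all dataset, table, model, and column names are assumptions.
CREATE OR REPLACE MODEL `mydataset.fare_model`
TRANSFORM(
  -- Standardize a numerical column (zero mean, unit variance).
  ML.STANDARD_SCALER(trip_distance) OVER () AS trip_distance_scaled,
  -- Split a numerical column into 4 quantile-based buckets.
  ML.QUANTILE_BUCKETIZE(passenger_count, 4) OVER () AS passenger_bucket,
  -- Plain SQL expressions are allowed as well.
  EXTRACT(HOUR FROM pickup_datetime) AS pickup_hour,
  -- The label column must be passed through so that OPTIONS can reference it.
  fare_amount
)
OPTIONS(
  model_type = 'linear_reg',
  input_label_cols = ['fare_amount']
) AS
SELECT
  trip_distance,
  passenger_count,
  pickup_datetime,
  fare_amount
FROM `mydataset.taxi_trips`;

-- At prediction time we pass the raw columns; the transformations stored
-- with the model are applied automatically before scoring.
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.fare_model`,
  (SELECT trip_distance, passenger_count, pickup_datetime
   FROM `mydataset.new_trips`)
);
```

Because the transformations are saved as part of the model, the prediction query does not need to repeat them, which helps keep training and prediction inputs consistent.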