Preparing the datasets
In this section, we'll learn which techniques we can apply to ensure that the data we use to build our ML model is correct and produces the desired results. After that, we'll look at the strategies we can use to split the datasets into training, validation, and test sets.
Working with high-quality data
In this section, we'll look at the characteristics that our datasets should have in order to develop effective BigQuery ML models.
Since ML models learn from data, it's very important to feed our ML algorithms with high-quality data, especially during the training phase. Data quality is a very broad topic, and analyzing it in detail would require a dedicated book. For this reason, we will focus only on the main data quality concepts that relate to building an ML model.
Important note
Data quality is a discipline that includes processes, professionals, technologies, and best practices to identify and...