Summary
In this chapter, we started digging into our weather dataset, focusing on the problem of data labeling. We learned how to use SageMaker Ground Truth to label large datasets using a combination of human review and automation, how to use custom workflows to aid the labeling process, and how to reduce labeling bias by collecting opinions from multiple labelers. We ended with some advice on keeping the labeling process secure.
In the next chapter, we'll explore data preparation. We'll run a feature engineering processing job on the full dataset.