Annotating Real Data
Data is the fuel of the machine learning (ML) engine, and it is available in almost every part of our technology-driven world. However, ML models usually need to be trained or evaluated on annotated data, not just raw data. Raw data by itself is therefore of limited use for ML; what models typically need is annotated data.
In this chapter, we will learn why ML models need annotated data and why the annotation process is expensive, error-prone, and biased. We will also introduce the annotation process for a number of ML tasks, such as image classification, semantic segmentation, and instance segmentation, and highlight the main annotation problems. Finally, we will see why generating ideal ground truth is impossible or extremely difficult for tasks such as optical flow estimation and depth estimation.
In this chapter, we’re going to cover the following main topics:
- The need to annotate real data for ML
- Issues with...