Handling data labeling
In this section, we will look at one of the most time-consuming and important tasks in preprocessing a dataset for ML training: data labeling. As we saw while looking at dimensionality reduction and other ML techniques in Chapter 5, Performing Data Analysis and Visualization, in most scenarios it is vital to have labels attached to our samples. As we discussed in Chapter 1, Understanding the End-to-End Machine Learning Process, there are only a few scenarios where unsupervised learning models are sufficient, such as a model that clusters emails into spam and not spam. In most cases, we want to use a supervised model, which means we need labels.
In the following sections, we will discuss which scenarios require manual labeling and how Azure Machine Learning can help us perform this monotonous task as efficiently as possible.
Analyzing scenarios that require labels
We will start by looking at the types...