Semi-supervised learning methods
Semi-supervised learning [5] applies when a dataset contains both labeled and unlabeled data. This situation is common because labels are often expensive or difficult to obtain, so only part of the data ends up labeled. The unlabeled part is still useful: a semi-supervised method produces outputs that agree with the available target labels while also conforming to the latent structure revealed by the unlabeled data.
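As a toy illustration of how unlabeled data can change a prediction, the sketch below compares a purely supervised 1-nearest-neighbor rule with a greedy form of self-training (pseudo-labeling). This is not a method taken from the text: the data points and the helper names `nearest_label` and `self_train` are hypothetical, chosen only to make the idea concrete.

```python
import math

def nearest_label(point, labeled):
    """Return (distance, label) of the labeled point closest to `point`."""
    return min((math.dist(point, p), y) for p, y in labeled.items())

def self_train(labeled, unlabeled):
    """Greedy 1-NN self-training: repeatedly pseudo-label the unlabeled
    point that lies closest to any already-labeled point."""
    labeled = dict(labeled)        # copy so the caller's dict is untouched
    remaining = list(unlabeled)
    while remaining:
        # The most "confident" candidate is the one nearest to a labeled point.
        best = min(remaining, key=lambda u: nearest_label(u, labeled)[0])
        labeled[best] = nearest_label(best, labeled)[1]
        remaining.remove(best)
    return labeled

# Hypothetical toy data: two chains of points, one labeled point per chain.
labeled = {(0, 0): 0, (3, 2): 1}
unlabeled = [(1, 0), (2, 0), (3, 0), (4, 2), (5, 2)]
query = (3, 0)                     # lies at the end of the chain that starts at (0, 0)

# Supervised only: the nearest labeled point is (3, 2), so the prediction is 1.
supervised_only = nearest_label(query, labeled)[1]

# Semi-supervised: labels spread along the chain, so the prediction is 0.
semi_supervised = self_train(labeled, unlabeled)[query]

print(supervised_only, semi_supervised)  # prints: 1 0
```

With only the two labeled points, the query is assigned to the nearer labeled point. Once the unlabeled points are taken into account, the label follows the chain of nearby points instead, which is the kind of latent structure the paragraph above refers to.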
Semi-supervised learning methods usually combine supervised and unsupervised learning strategies, and can yield a more accurate model than one trained on the labeled data alone or on the unlabeled data alone. For this combination to add value over purely supervised training on the labeled data, some assumptions about the data, which are to an extent equivalent to one another, need to hold:
- Smoothness assumption: If two points $x_1$ and $x_2$ have similar values, their corresponding target values, $y_1$ and $y_2$, should be similar as well
- Low...