Semi-supervised learning is another class of machine learning process and technique that also makes use of unlabeled data for training (as does unsupervised learning) but, typically, a small amount of labeled data with a large amount of unlabeled data is present and used by the model. This is usually referred to as partly labeled data.
Semi-supervised learning falls somewhere between unsupervised learning (without any labeled training data) and supervised learning (with completely labeled training data).
Semi-supervised learning programs do attempt to use certain standard assumptions to help them make use of unlabeled data. These standard assumptions are continuity, cluster, and manifold.
Without going too deep into describing these assumptions, loose definitions are as follows:
- Continuity: This assumption implies that close data points also tends to...