Label propagation
Label propagation is a family of semi-supervised algorithms that rely on the graph representation of a dataset to exploit the existing relations among nodes in order to propagate the labels to unlabeled points. In particular, if we have N labeled points (with bipolar labels +1 and –1) and M unlabeled points (denoted by y = 0), it's possible to build an undirected graph based on a measure of geometric affinity among samples. In the following figure, there's an example of such a structure:
Example of binary graph
The graph is defined as a structure containing two sets G = {V, E}. E is the set of vertices (or nodes) and contains all sample labels V = {–1, +1, 0}, while the edge set E is based on an affinity measure that encodes the strength of the relation between two nodes. For practical reasons, it's helpful to introduce a matrix W whose elements wij are:
- The actual weight of the edge (i, j). In this case...