Label propagation based on Markov random walks
The goal of this algorithm proposed by Zhu and Ghahramani is to find the probability distribution of target labels for unlabeled samples given a mixed dataset. This objective is achieved through the simulation of a stochastic process, where each unlabeled sample walks through the graph until it reaches a stationary absorbing state, a labeled sample where it stops acquiring the corresponding label. The main difference with other similar approaches is that in this case, we consider the probability of reaching a labeled sample. In this way, the problem acquires a closed form and can be easily solved.
The first step is to always build a k-nearest neighbors graph with all N samples, and define a weight matrix W based on an RBF kernel:
Wij = 0 is xi, and xj are not neighbors and Wii = 1. The transition probability matrix, similarly to the Scikit-Learn label propagation algorithm, is built as:
In a more compact way, it can be rewritten as P = D-1W. If...