Label propagation based on Markov random walks
The goal of this algorithm proposed by Zhu and Ghahramani is to find the probability distribution of target labels for unlabeled samples given a mixed dataset. This objective is achieved through the simulation of a stochastic process, where each unlabeled sample walks through the graph until it reaches a stationary absorbing state, a labeled sample, where it stops acquiring the corresponding label. The main difference with other similar approaches is that in this case, we consider the probability of reaching a labeled sample. In this way, the problem acquires a closed form and can be easily solved.
The first step is to always build a k-nearest neighbors graph with all N samples, and define a weight matrix W based on an RBF kernel:
Wij = 0 means that , and are not neighbors, and Wii = 0. The transition probability matrix is built similarly to the scikit-learn label propagation algorithm, as:
In a more compact way, it can...