Label propagation based on Markov random walks
The goal of this algorithm proposed by Zhu and Ghahramani is to find the probability distribution of target labels for unlabeled samples given a mixed dataset. This objective is achieved through the simulation of a stochastic process, where each unlabeled sample walks through the graph until it reaches a stationary absorbing state, a labeled sample, where it stops acquiring the corresponding label. The main difference with other similar approaches is that in this case, we consider the probability of reaching a labeled sample. In this way, the problem acquires a closed form and can be easily solved.
The first step is to always build a k-nearest neighbors graph with all N samples, and define a weight matrix W based on an RBF kernel:
data:image/s3,"s3://crabby-images/86ddc/86ddcdd4482e12568da302a8cdb807a237a14e76" alt=""
Wij = 0 means that , and
are not neighbors, and Wii = 0. The transition probability matrix is built similarly to the scikit-learn label propagation algorithm, as:
data:image/s3,"s3://crabby-images/afea3/afea345f22b28285c616a29ca93c82e5cb2b46b4" alt=""
In a more compact way, it can...