Label propagation based on Markov random walks

The goal of this algorithm, proposed by Zhu and Ghahramani, is to find the probability distribution of target labels for the unlabeled samples in a mixed dataset. The objective is achieved by simulating a stochastic process in which each unlabeled sample walks through the graph until it reaches an absorbing state (a labeled sample), where it stops and acquires the corresponding label. The main difference from other similar approaches is that here we consider the probability of reaching a labeled sample; in this way, the problem acquires a closed form and can be easily solved.

The first step is always to build a k-nearest neighbors graph with all N samples and define a weight matrix W based on an RBF kernel:

Wij = exp(-γ‖xi - xj‖²)

Wij = 0 if xi and xj are not neighbors, and Wii = 1. The transition probability matrix, similarly to the Scikit-Learn label propagation algorithm, is built as:

Pij = Wij / Σk Wik
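As a sketch of this construction (the function name and the values of k and γ are illustrative, and dense matrices are used for clarity), the graph, the weight matrix W, and the transition matrix P can be built with NumPy and Scikit-Learn:

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph


def build_transition_matrix(X, k=10, gamma=0.1):
    # k-NN connectivity mask (1 if neighbors, 0 otherwise), made symmetric
    A = kneighbors_graph(X, n_neighbors=k, mode='connectivity',
                         include_self=True).toarray()
    A = np.maximum(A, A.T)

    # RBF weights, zeroed outside the neighborhood; Wii = exp(0) = 1
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-gamma * sq_dists) * A

    # Row-normalized transition matrix P = D^-1 W (rows sum to 1)
    P = W / W.sum(axis=1, keepdims=True)
    return W, P
```

Each row of P is a valid probability distribution over the neighbors of the corresponding sample, so a random walk can be simulated by sampling transitions from it.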

In a more compact way, it can be rewritten as P = D-1W. If we now consider a test sample that starts from the state xi and randomly walks until an absorbing labeled state is found (we call this label y), the probability (restricted to binary classification) can be expressed as:

P(y=1|xi) = 1(yi = 1) if xi is labeled
P(y=1|xi) = Σj Pij P(y=1|xj) if xi is unlabeled

When xi is labeled, the state is final, and the probability is expressed through the indicator function based on the condition yi = 1. When the sample is unlabeled, we need to consider the sum over all possible transitions starting from xi and ending in the closest absorbing state with label y=1, weighted by the relative transition probabilities.
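The recursion above can also be read as a fixed-point iteration: repeatedly apply one transition step to the probability vector while keeping the labeled (absorbing) entries clamped. This is only a sketch under illustrative assumptions (labels encoded as 1/0, with -1 marking unlabeled samples), not the closed-form solution derived in the text:

```python
import numpy as np


def walk_probabilities(P, y, n_iter=1000):
    # y: 1/0 for labeled samples, -1 for unlabeled ones (illustrative encoding)
    labeled = (y != -1)
    # Initial guess: labeled entries are clamped, unlabeled ones start at 0.5
    p = np.where(labeled, y, 0.5).astype(float)
    for _ in range(n_iter):
        p = P @ p                  # one random-walk transition from every state
        p[labeled] = y[labeled]    # absorbing states never change their label
    return p
```

When every unlabeled sample is connected to at least one labeled sample, the iteration converges to the same probabilities as the closed-form linear system.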

We can rewrite this expression in matrix form. If we create a vector P = [ PL(y=1|XL), PU(y=1|XU) ], where the first component refers to the labeled samples and the second to the unlabeled ones, we can write:

P = D-1WP

If we now expand the matrices into their labeled (l) and unlabeled (u) blocks, we get:

PL = Dll-1(Wll PL + Wlu PU)
PU = Duu-1(Wul PL + Wuu PU)

As we are interested only in the unlabeled samples, we can consider only the second equation:

PU = Duu-1(Wul PL + Wuu PU)

Simplifying the expression (multiplying both sides by Duu and moving the PU term to the left), we get the following linear system:

(Duu - Wuu)PU = Wul PL

The term (Duu - Wuu) is the unlabeled-unlabeled part of the unnormalized graph Laplacian L = D - W. By solving this system, we can get the probabilities for the class y=1 for all unlabeled samples.
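The closed-form solution can be sketched as follows (the function name and the label encoding, 1/0 for labeled samples and -1 for unlabeled ones, are illustrative):

```python
import numpy as np


def propagate_labels(W, y):
    # y: 1/0 for labeled samples, -1 for unlabeled ones (illustrative encoding)
    unlabeled = (y == -1)
    labeled = ~unlabeled

    D = np.diag(W.sum(axis=1))

    # Blocks of D and W entering the unnormalized Laplacian L = D - W
    Duu = D[np.ix_(unlabeled, unlabeled)]
    Wuu = W[np.ix_(unlabeled, unlabeled)]
    Wul = W[np.ix_(unlabeled, labeled)]

    # Solve (Duu - Wuu) PU = Wul PL for the class y = 1
    PL = y[labeled].astype(float)
    PU = np.linalg.solve(Duu - Wuu, Wul @ PL)
    return PU  # P(y = 1 | xi) for each unlabeled sample
```

The system is solvable when every connected component of the unlabeled subgraph is linked to at least one labeled sample, which makes Duu - Wuu nonsingular; each resulting probability can then be thresholded (for example at 0.5) to assign a hard label.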