- Mastering Machine Learning Algorithms
- Giuseppe Bonaccorso
Transductive Support Vector Machines (TSVM)
Another approach to the same problem is offered by the TSVM, proposed by T. Joachims (in Transductive Inference for Text Classification Using Support Vector Machines, Joachims T., ICML 1999). The idea is to keep the original objective function but with two sets of slack variables: one for the labeled samples and one for the unlabeled ones:
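The objective function itself is not reproduced in this excerpt; a standard formulation consistent with the description above (notation assumed), where C weights the labeled slack variables and C⁽ᵘ⁾ the unlabeled ones, is:

```latex
\min_{w,\, b,\, y^{(u)},\, \xi} \;
\frac{1}{2}\lVert w \rVert^2
+ C \sum_{i=1}^{N} \xi_i
+ C^{(u)} \sum_{j=1}^{M} \xi_j
```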
As this is a transductive approach, we need to consider the unlabeled samples as variable-labeled ones (subject to the learning process), imposing constraints similar to those on the supervised points. As for the previous algorithm, we assume we have N labeled samples and M unlabeled ones; therefore, the conditions become as follows:
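The constraint set is also missing from this excerpt; in standard notation (assumed here), the three conditions described in the next paragraph take the following form:

```latex
\begin{aligned}
& y_i \left( w^T x_i + b \right) \ge 1 - \xi_i
  \quad \text{and} \quad \xi_i \ge 0 \quad \forall\, i \in (1, N) \\
& y_j^{(u)} \left( w^T x_j + b \right) \ge 1 - \xi_j
  \quad \text{and} \quad \xi_j \ge 0 \quad \forall\, j \in (1, M) \\
& y_j^{(u)} \in \{-1, 1\}
\end{aligned}
```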
The first constraint is the classical SVM one and it works only on the labeled samples. The second one uses the variable labels y(u)j with the corresponding slack variables ξj to impose a similar condition on the unlabeled samples, while the third one is necessary to constrain the labels to take only the values -1 and 1.
Just like semi-supervised SVMs, this problem is non-convex and it's useful to try different methods to optimize it. Moreover, the author, in the aforementioned paper, showed that TSVM works better when the test set (unlabeled) is large and the training set (labeled) is relatively small (and a standard supervised SVM is therefore outperformed). On the other hand, with large training sets and small test sets, a supervised SVM (or another algorithm) is generally preferable, because it is faster and yields better accuracy.
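Since the problem is non-convex, one pragmatic option is to relax the variable labels to the continuous interval [-1, 1] and minimize the objective directly with a general-purpose optimizer. The following sketch (toy data, penalty values, and the use of SciPy's SLSQP method are all assumptions, not the author's exact procedure) replaces the slack variables with equivalent hinge-loss terms:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy dataset: N labeled and M unlabeled 2D points
# drawn from two well-separated clusters (assumption for illustration)
rng = np.random.RandomState(0)
N, M = 20, 40
X_l = np.vstack([rng.randn(N // 2, 2) + [2, 2],
                 rng.randn(N // 2, 2) - [2, 2]])
y_l = np.array([1.0] * (N // 2) + [-1.0] * (N // 2))
X_u = np.vstack([rng.randn(M // 2, 2) + [2, 2],
                 rng.randn(M // 2, 2) - [2, 2]])

C, Cu = 1.0, 0.5  # penalties for labeled/unlabeled slacks (assumed values)

def objective(theta):
    # theta packs w (2), b (1), and the M variable labels y_u
    w, b, y_u = theta[:2], theta[2], theta[3:]
    # Hinge terms play the role of the slack variables xi_i and xi_j
    xi_l = np.maximum(0.0, 1.0 - y_l * (X_l @ w + b))
    xi_u = np.maximum(0.0, 1.0 - y_u * (X_u @ w + b))
    return 0.5 * w @ w + C * xi_l.sum() + Cu * xi_u.sum()

# Relax the discrete constraint y_u in {-1, 1} to the box [-1, 1]
theta0 = np.concatenate([rng.randn(3) * 0.1, np.sign(rng.randn(M))])
bounds = [(None, None)] * 3 + [(-1.0, 1.0)] * M
res = minimize(objective, theta0, method='SLSQP',
               bounds=bounds, options={'maxiter': 500})

w, b = res.x[:2], res.x[2]
# Recover discrete labels for the unlabeled samples
y_u_learned = np.sign(res.x[3:])
```

Because the objective is non-convex, the result depends on the initial point; in practice it is worth restarting from several random initializations and keeping the solution with the lowest objective value.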