A Transductive Bound for the Voted Classifier with an Application to Semi-Supervised Learning
Massih-Reza Amini(1), François Laviolette(2), Nicolas Usunier (1)
(1) Laboratoire d'Informatique Paris 6
104, avenue du Président Kennedy
75016 Paris, France
(2) Université Laval
Pavillon Adrien-Pouliot
Québec, Canada, G1K 7P4
We propose two transductive bounds on the risk of majority votes estimated over partially labeled training sets. The first
involves the margin distribution of the classifier and a risk bound on its associated Gibbs classifier. The bound is tight when
the Gibbs bound is tight and when the errors of the majority vote classifier are concentrated in a zone of low margin. In semi-supervised learning,
taking the margin as an indicator of confidence is the working hypothesis of algorithms that search for the decision boundary in low-density
regions. Under this assumption, we bound the error probability of the voted classifier on the examples whose margins
are above a fixed threshold. As an application, we propose a self-learning algorithm that iteratively assigns pseudo-labels to the unlabeled
training examples whose margins exceed a threshold obtained from this bound. Empirical results on different datasets show the effectiveness
of our approach compared to the same algorithm with a manually fixed threshold and to the TSVM.
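To fix ideas, the margin-based self-learning loop described above can be sketched as follows. This is an illustrative sketch only: it uses a fixed, hand-chosen margin threshold and a simple nearest-class-mean base classifier, whereas the paper's algorithm derives the threshold from the transductive bound; the function name `self_train` and all parameters are ours, not the paper's.

```python
import numpy as np

def self_train(X_l, y_l, X_u, threshold=1.0, max_iter=10):
    """Illustrative margin-based self-training (binary labels in {-1, +1}).

    At each iteration, fit a nearest-class-mean classifier on the current
    labeled set, score the unlabeled points (margin = |score|), pseudo-label
    those whose margin exceeds `threshold`, and move them to the labeled set.
    NOTE: the fixed `threshold` stands in for the bound-derived threshold of
    the paper; this is a hypothetical sketch, not the authors' procedure.
    """
    X_l, y_l, unlabeled = X_l.copy(), y_l.copy(), X_u.copy()
    for _ in range(max_iter):
        # Fit: one mean per class.
        mu_pos = X_l[y_l == 1].mean(axis=0)
        mu_neg = X_l[y_l == -1].mean(axis=0)
        if len(unlabeled) == 0:
            break
        # Signed score: positive when closer to the positive-class mean.
        scores = (np.linalg.norm(unlabeled - mu_neg, axis=1)
                  - np.linalg.norm(unlabeled - mu_pos, axis=1))
        confident = np.abs(scores) >= threshold
        if not confident.any():
            break  # no unlabeled example clears the margin threshold
        # Pseudo-label the high-margin examples and add them to the labeled set.
        X_l = np.vstack([X_l, unlabeled[confident]])
        y_l = np.concatenate([y_l, np.sign(scores[confident])])
        unlabeled = unlabeled[~confident]
    return mu_pos, mu_neg, X_l, y_l
```

A usage example: start from two well-separated Gaussian clusters with only two labeled points per class, and let the loop pseudo-label the rest.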