A Boosting Algorithm for Learning Bipartite Ranking Functions with Partially Labeled Data
Massih-Reza Amini(1), Tuong-Vinh Truong(1), Cyril Goutte(2)
(1) Laboratoire d'Informatique Paris 6
(2) National Research Council Canada
104, avenue du président
boulevard Alexandre Taché
This paper presents a boosting-based algorithm for learning a bipartite ranking function (BRF) with partially labeled data. Previous attempts to build a BRF from such data have worked in a transductive setting, in which the test points are given to the method in advance as unlabeled data. The proposed approach is a semi-supervised inductive ranking algorithm which, unlike transductive algorithms, is able to infer an ordering on new examples that were not used for its training. We evaluate our approach on the TREC-9 Ohsumed and Reuters-21578 data collections, comparing it against two semi-supervised classification algorithms in terms of area under the ROC curve (AUC), uninterpolated average precision (AUP), mean precision$@50$ (TP) and Precision-Recall (PR) curves. In the most interesting cases, where irrelevant examples far outnumber relevant ones, our method yields statistically significant improvements on these ranking measures.
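For readers unfamiliar with the ranking measures named above, the following is a minimal sketch of how AUC, uninterpolated average precision and precision at a cutoff can be computed for a single ranked list. It is an illustration only, not the evaluation code used in the paper: the function names, the binary 0/1 relevance encoding and the toy scores are all assumptions, and the mean precision at 50 reported in the paper would presumably average the per-list value over several rankings.

```python
from typing import List, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of (relevant, irrelevant) pairs ranked in the right order.

    Ties between a relevant and an irrelevant score count as half a pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one relevant and one irrelevant example")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def ranked_labels(scores: Sequence[float], labels: Sequence[int]) -> List[int]:
    """Labels reordered by decreasing score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [labels[i] for i in order]


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Uninterpolated average precision: mean of precision at each relevant rank."""
    hits = 0
    total = 0.0
    for rank, y in enumerate(ranked_labels(scores, labels), start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("average precision needs at least one relevant example")
    return total / hits


def precision_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Fraction of relevant examples among the k highest-scored ones."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(ranked_labels(scores, labels)[:k]) / k


# Toy data (hypothetical): two relevant and three irrelevant examples.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [1, 0, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
toy_ap = average_precision(toy_scores, toy_labels)
toy_p2 = precision_at_k(toy_scores, toy_labels, 2)
```

On the toy data, five of the six relevant/irrelevant pairs are ordered correctly, so the AUC is 5/6; the relevant examples sit at ranks 1 and 3, so the average precision is (1 + 2/3)/2, also 5/6; and precision at 2 is 1/2.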