A Selective Sampling Strategy for Label-Ranking

Massih-Reza Amini(1), Nicolas Usunier(1), François Laviolette(2), Alexandre Lacasse(2), Patrick Gallinari(1)
(1) Laboratoire d'Informatique Paris 6                        (2)Université Laval
                      8, rue du capitaine scott                                  Pavillon Adrien-Pouliot         
                                75015 Paris                                           G1K7P4 Canada              

We propose a novel active learning strategy based on the compression framework for label ranking functions which, given an input instance, predict a total order over a predefined set of alternatives. Our approach is theoretically motived by an extension to ranking and active learning of Kääriänen's generalisation bounds using unlabeled data, initially developed in the context of classification. The bounds we obtain suggest a selective sampling strategy provided that a sufficiently, yet reasonably large initial labeled dataset is provided. Experiments on Information Retrieval corpora from automatic text summarization and question/answering show that the proposed approach allows to substantially reduce the labeling effort in comparison to random and heuristic-based sampling strategies.