Comparative Classifier Evaluation for Web-scale Taxonomies using Power Law
Rohit Babbar, Ioannis Partalas, Cornélia Metzig, Eric Gaussier and Massih-Reza Amini
Laboratoire d'Informatique de Grenoble
Université Joseph Fourier, Grenoble, France
In web-scale taxonomies such as the Mozilla and Yahoo! directories, previous work has shown that category sizes follow a power-law distribution at every level of the taxonomy. In this work, we analyse how such high-level semantics can be leveraged to evaluate the accuracy of hierarchical classifiers, which automatically assign unseen documents to leaf-level categories in the taxonomy. The commonly used evaluation method, which relies on k-fold cross-validation, suffers from computational challenges for such large-scale taxonomies. The proposed technique provides a necessary condition, based on power-law behavior, for acceptable performance of a hierarchical classifier. Applying this technique to classifier evaluation on publicly available data supports our claim empirically.