Learning from non-IID data: Theory, Algorithms and Practice

^Objective

This workshop addresses the problem of learning from data that are not independently and identically distributed (IID), even though IIDness is a common assumption in statistical machine learning. While this assumption helps to study the properties of learning procedures (e.g. generalization ability) and guides the building of new algorithms, there are many real-world situations where it does not hold. This is particularly the case for many challenging machine learning tasks that have recently received much attention, such as (but not limited to): ranking, active learning, hypothesis testing, learning with graphical models, prediction on graphs, mining (social) networks, and multimedia or language processing. The goal of this workshop is to bring together research works aiming at identifying problems where the assumption of identical distribution, the assumption of independence, or both, are violated, and where carefully taking the non-IIDness into account is expected to be of primary importance.
Examples of such problems are:

Bipartite ranking or, more generally, pairwise classification, where pairing up IID variables entails non-IIDness: while the data may still be identically distributed, it is no longer independent;

Active learning, where labels for specific data are requested by the learner: the independence assumption is also violated;

Learning with covariate shift, where the training and test marginal distributions of the data differ: the identically distributed assumption does not hold;

Online learning with streaming data, when the distribution of the incoming examples changes over time: the examples are not identically distributed.
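The first example above (pairing up IID variables) can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not part of the workshop material: it assumes standard normal observations and uses the difference of two observations as a hypothetical pair-level feature. The function names are invented for this example. Pairs that share an observation are correlated (the theoretical value for this feature is 0.5), while pairs built from disjoint observations are not.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def pair_correlations(num_trials=20000, seed=0):
    """Estimate the correlation between pair-level features.

    Each trial draws four IID standard normal values (an assumption made
    for this illustration) and forms three pairwise differences:
      d_ij = x_i - x_j
      d_ik = x_i - x_k   (shares x_i with d_ij)
      d_kl = x_k - x_l   (shares nothing with d_ij)
    Returns (correlation of d_ij with d_ik, correlation of d_ij with d_kl).
    """
    rng = random.Random(seed)
    d_ij, d_ik, d_kl = [], [], []
    for _ in range(num_trials):
        x_i, x_j, x_k, x_l = (rng.gauss(0.0, 1.0) for _ in range(4))
        d_ij.append(x_i - x_j)
        d_ik.append(x_i - x_k)
        d_kl.append(x_k - x_l)
    return pearson(d_ij, d_ik), pearson(d_ij, d_kl)


if __name__ == "__main__":
    shared, disjoint = pair_correlations()
    print(f"pairs sharing an observation: correlation = {shared:.3f}")
    print(f"pairs with disjoint observations: correlation = {disjoint:.3f}")
```

Each pair feature has the same marginal distribution, so the pairs remain identically distributed; only independence is lost, which matches the description of bipartite ranking given above.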

We see the workshop as a venue not only for the presentation of papers focusing on carefully dealing with non-IID data, but also as a forum for sharing ideas across different application domains. Hence, it will be an opportunity for discussions on methods that address non-IIDness from the following standpoints:

Theoretical: results on generalization bounds and learnability, contributions that mathematically formalize the types of non-IIDness encountered, results on the extent to which non-IIDness does not harm the validity of theoretical results built on the IID assumption, helpfulness of the online learning framework,

Algorithmic: theoretically motivated algorithms designed to handle non-IID data, approaches that make it possible for classical learning results to carry over, online learning procedures,

Practical: successful applications of non-IID learning methods to learning from streaming data, web data, biological data, multimedia, natural language, social network mining.

^{Submission format}

Please send to lniid09[at]liste.lif.univ-mrs.fr, in PDF or PostScript in the LNCS format, a full paper of at most 8 pages for one of the tracks below:

Oral presentation,
Poster spotlights,
Posters.

^Organizers

Massih-Reza Amini, National Research Council, Canada
Amaury Habrard, University of Marseille, France
Liva Ralaivola, University of Marseille, France
Nicolas Usunier, University Pierre et Marie Curie, France

^{Program Committee}

Shai Ben-David, University of Waterloo, Canada
Gilles Blanchard, Fraunhofer FIRST (IDA), Germany
Stéphan Clémençon, Télécom ParisTech, France
François Denis, University of Provence, France
Claudio Gentile, University dell'Insubria, Italy
Balaji Krishnapuram, Siemens Medical Solutions, USA
François Laviolette, Université Laval, Canada
Xuejun Liao, Duke University, USA
Richard Nock, University Antilles-Guyane, France
Daniil Ryabko, Institut National de Recherche en Informatique et Automatique, France
Marc Sebban, University of Saint-Etienne, France
Ingo Steinwart, Los Alamos National Labs, USA
Masashi Sugiyama, Tokyo Institute of Technology, Japan
Nicolas Vayatis, École Normale Supérieure de Cachan, France
Zhi-Hua Zhou, Nanjing University, China

^{Keynote Speakers}

Shai Ben-David, University of Waterloo, Canada

Title: Towards theoretical understanding of domain adaptation learning

Abstract: Machine learning enjoys a deep and powerful theory that has led to a wide variety of highly successful practical tools. However, most of this theory is developed under some simplifying assumptions that clearly fail in the real world. In particular, a fundamental assumption of the theory is that the data available for training and the data of the target application come from the same source. When this assumption fails, the learner is faced with a “domain adaptation” challenge. In the past few years, the range of machine learning applications has expanded to include various tasks requiring domain adaptation. Such applications have been addressed by several heuristic paradigms. However, the common theoretical models fall short of providing useful analysis of these techniques. The key to domain adaptation is the similarity between the training and target domains. In this talk I will discuss several parameters along which task similarity can be defined and measured, and discuss to what extent they can be utilized to direct learning algorithms and guarantee their success. Recent work can provide theoretical justification for some existing practical heuristics, as well as guide the development of novel algorithms for handling some types of data discrepancies. However, our current understanding leaves much to be desired. I shall devote the last part of the talk to describing some of the challenges and open questions that will have to be addressed before one can claim a satisfactory understanding of learning in the presence of training-test discrepancies. The talk is based on joint works with John Blitzer, Koby Crammer and Fernando Pereira, and with my students David Pal, Teresa Luu and Tyler Lu.

Nicolas Vayatis, École Normale Supérieure de Cachan, France

Title: Empirical risk minimization with statistics of higher order with examples from bipartite ranking

Abstract: Statistical learning theory was mainly developed in the framework of binary classification under the assumption that observations in the training set form an i.i.d. sample. The techniques used to provide statistical guarantees for state-of-the-art learning algorithms are borrowed from the theory of empirical processes. This is made possible not only by the "i.i.d." assumption on the data but also by the nature of the performance measures, such as classification error or margin error, which are statistics of order one. In the talk, I will discuss a variety of questions which arise in the theory when more involved criteria are considered. The problem of bipartite ranking through ROC curve optimization provides a prolific source of optimization functionals which are statistics of order strictly larger than one, and several examples will be presented.
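To make the distinction between statistics of order one and of higher order concrete, here is a minimal sketch that is not taken from the talk. It contrasts the empirical classification error, an average over single observations, with the empirical area under the ROC curve (AUC) in its standard pairwise (Wilcoxon-Mann-Whitney) form, an average over pairs made of one positive and one negative observation. The function names, the decision threshold at zero, the +1/-1 label encoding and the toy scores are assumptions made for this illustration.

```python
def empirical_error(scores, labels, threshold=0.0):
    """Classification error: an average over single observations (order one).

    An observation is predicted positive when its score exceeds `threshold`.
    Labels are assumed to be +1 (positive) or -1 (negative).
    """
    mistakes = [(s > threshold) != (y == 1) for s, y in zip(scores, labels)]
    return sum(mistakes) / len(mistakes)


def empirical_auc(scores, labels):
    """Empirical AUC: an average over (positive, negative) pairs (order two).

    Each pair contributes 1 if the positive is scored strictly higher,
    0.5 in case of a tie, and 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y != 1]
    total = 0.0
    for s_pos in positives:
        for s_neg in negatives:
            if s_pos > s_neg:
                total += 1.0
            elif s_pos == s_neg:
                total += 0.5
    return total / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy data, invented for this example.
    scores = [0.9, 0.4, -0.2, 0.1, -0.7]
    labels = [1, 1, 1, -1, -1]
    print("classification error:", empirical_error(scores, labels))
    print("empirical AUC:", empirical_auc(scores, labels))
```

Because every observation appears in several pairs, the terms of the pairwise sum are not independent even when the underlying sample is i.i.d., which is why criteria of this kind call for tools beyond those used for averages of order one.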



^{Important Dates}

Paper submission deadline: June 10, 2009 - Extended deadline: June 21, 2009
Notification of acceptance: June 30, 2009
Final camera ready submissions: August 15, 2009
Workshop: September 7, 2009