Research interests: Machine learning, information retrieval, computational linguistics, social network analysis.

My research focuses on the modeling of large document collections for information access (mainly categorization, clustering and information retrieval) and lies at the intersection of three main research domains: machine learning, information retrieval and computational linguistics. I am particularly interested in explaining phenomena encountered in large-scale textual collections and in designing new models that take into account the properties of such collections. I am also interested in modeling how textual information is shared in social (content) networks.


Scientific activities

I am a member of the editorial board of Document Numérique, and a past member of the editorial boards of Traitement automatique des langues, Computational Linguistics and International Journal of Corpus Linguistics. I was program co-chair of EMNLP 2006, area chair for SIGIR 2010, SIGIR 2012, SIGIR 2013, ECIR 2011, ECIR 2012, ECIR 2013 and ECIR 2014, and co-chair of the workshops LSHC1 (within ECIR 2010), LSHC2 (within ECML 2011) and LSHC3 (within ECML 2013). I am currently workshop co-chair for EMNLP 2014. I was a member of the Executive Board of the European Association for Computational Linguistics from 2007 to 2010 and a member of the Computer Science panel of the European Research Council for Starting/Consolidator Grants from 2007 to 2013, and I have been a member of the Advisory Board of SIGDAT since 2005.


I am involved in the following projects on these topics:

  • New theoretical frameworks in metric learning (regional project) (Sept. 2013-Sept. 2016)
  • Khronos Persyval (labex) project on data mining of temporal data (Sept. 2013-Sept. 2017)
  • BioASQ (Eur. project) (Oct. 2012-Oct. 2014)
  • ARESOS CNRS Mastodons project (started in 2012)
  • CLASS-Y (ANR project) (February 2011-February 2015)

and was recently involved in the projects:

  • PASCAL2 European network of excellence (2009-2013)
  • MeTRICC (ANR project) (December 2008-December 2011)
  • FRAGRANCES (ANR project) (December 2008-December 2011)
  • LASCAR (LArge Scale CAtegoRization - UJF project) (January 2008-December 2009)
  • INFOM@GIC (French project) (2005-2006 for my participation)
  • PASCAL European Network of Excellence (2004-2006)
  • REVEAL THIS (European project) (2004-2007)
  • KerMIT (European project) (2001-2004)
  • MuchMore (European project) (199-2002)
  • Outiller les Alliances (French project) (2001-2003)