A0827
Title: Model-based co-clustering: High dimension and estimation challenges
Authors: Christophe Biernacki - Inria (France) [presenting]
Christine Keribin - INRIA - Paris-Saclay University (France)
Julien Jacques - University Lyon II (France)
Abstract: Model-based co-clustering can be seen as a particularly important extension of model-based clustering. It parsimoniously reduces both the number of rows (individuals) and the number of columns (variables) of a data set, and the reduced data set remains interpretable because the meaning of the initial individuals and features is preserved. Moreover, it benefits from a rich statistical theory for both estimation and model selection. Many works have advanced this topic in recent years, and a general update of the related literature is offered. This is an opportunity to advocate two main messages, supported by specific research material: (1) co-clustering requires further research to fix some well-identified estimation issues, and (2) co-clustering is one of the most promising approaches for clustering in the (very) high-dimensional setting, which reflects the general trend in modern data sets.