View Submission - EcoSta2021
A0571
Title: Advances in model-based clustering of high-dimensional data
Authors: Claire Gormley - University College Dublin (Ireland) [presenting]
Abstract: The model-based clustering framework provides well-established methods that uncover sub-groups of observations in data. Such methods offer several desirable benefits: reproducibility due to their statistical modelling basis, objectivity through the availability of principled model selection tools, and interpretability through the provision of parameter estimates and their associated uncertainties. However, model-based clustering approaches begin to lose traction as data dimension increases, whether in terms of the number of observations, variables, time points, etc. This loss of applicability is often due to stability issues associated with high-dimensional covariance matrices, optimisation difficulties and/or the expense of computing the likelihood function. We consider recent advances in model-based methods for clustering data where the number of variables $p$ is large. In particular, we explore developments in factor analytic approaches, which are well-known models for big $p$ data, and recent work utilising composite likelihood methods that facilitate the computation of otherwise intractable likelihood functions. The utility of such methods is illustrated through benchmark and real data sets.
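As a rough illustration of why factor analytic approaches suit large $p$, consider the standard mixture of factor analysers. This is a textbook formulation, not necessarily the specific models presented in the talk, and the symbols $G$, $q$, $\pi_g$, $\mu_g$, $\Lambda_g$ and $\Psi_g$ are notation assumed here rather than taken from the abstract. Each of the $G$ clusters has a covariance matrix built from a $p \times q$ loadings matrix $\Lambda_g$, with $q \ll p$, and a diagonal matrix $\Psi_g$:

$$
f(y_i) = \sum_{g=1}^{G} \pi_g \, \mathcal{N}_p\!\left(y_i \mid \mu_g, \Sigma_g\right),
\qquad
\Sigma_g = \Lambda_g \Lambda_g^{\top} + \Psi_g .
$$

The number of free covariance parameters per cluster then grows linearly rather than quadratically in $p$:

$$
\underbrace{\frac{p(p+1)}{2}}_{\text{unconstrained } \Sigma_g}
\quad \text{versus} \quad
\underbrace{pq - \frac{q(q-1)}{2} + p}_{\Lambda_g \Lambda_g^{\top} + \Psi_g} .
$$

For example, with $p = 100$ and $q = 2$ this is 5050 parameters against 299, which is one reason such models remain stable where full covariance matrices do not.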