CFE-CMStatistics 2024
A0217
Title: Mixture models via continuous sparse regression
Authors: Yohann De Castro - Institut Camille Jordan (France) [presenting]
Abstract: Mixture models are a popular family of model-based clustering methods. A prominent example is the Gaussian mixture model fitted with the expectation-maximization (EM) algorithm. Building on recent advances in continuous sparse regression, a new method, referred to as Beurling LASSO, is introduced, and it is shown that one can recover (i) the number of components, (ii) the locations of the mixture at a rate of $n^{-1/4}$, and (iii) the weights of the mixture at the parametric rate of $n^{-1/2}$, where $n$ is the sample size. When the sample size is large, it is proven that one can reduce the dimensionality of the data while preserving the information that matters for clustering. This compressed representation, called a sketch, is significantly smaller than the original data but still retains enough information for the method to operate effectively. It is proven that the sketch size does not depend on the sample size but only on the number of components and the dimension.
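
The abstract does not say how the sketch is computed. As a purely illustrative example, the Python snippet below assumes one common construction from the compressive-learning literature: the sample is summarized by its empirical characteristic function evaluated at a fixed set of random frequencies. The function name `empirical_sketch`, the Gaussian frequency draw, the number of frequencies `m`, and the toy two-component mixture are all hypothetical choices made for this illustration, not details taken from the talk. The point it shows is only that the summary has length `m` whatever the sample size.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 2, 50  # data dimension and sketch size (hypothetical values)

# Hypothetical frequency design: m random directions in R^d.
freqs = rng.normal(size=(m, d))


def empirical_sketch(X, freqs):
    """Empirical characteristic function of the rows of X at the given frequencies.

    X has shape (n, d), freqs has shape (m, d); the result has shape (m,),
    so its size is independent of the sample size n.
    """
    return np.exp(1j * (X @ freqs.T)).mean(axis=0)


def sample_mixture(n):
    """Draw n points from a toy two-component Gaussian mixture (assumed setup)."""
    centers = np.array([[-2.0, 0.0], [2.0, 1.0]])
    weights = np.array([0.3, 0.7])
    labels = rng.choice(len(weights), size=n, p=weights)
    return centers[labels] + rng.normal(size=(n, d))


small = empirical_sketch(sample_mixture(1_000), freqs)
large = empirical_sketch(sample_mixture(20_000), freqs)

# Both summaries have m entries even though the samples differ in size.
print(small.shape, large.shape)
```

In this assumed construction, any estimator that works from the summary alone never needs to revisit the raw data, which is the practical appeal of a sketch whose size is governed by the number of components and the dimension rather than by $n$.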