A0283
Title: Minimax-optimal dimension-reduced clustering for high-dimensional nonspherical mixtures
Authors: Yuqi Gu - Columbia University (United States) [presenting]
Abstract: In mixture models, nonspherical (anisotropic) noise within each cluster is widely present in real-world data. Both the minimax rate and the optimal statistical procedure for clustering under high-dimensional nonspherical mixture models are studied. In high-dimensional settings, the information-theoretic limits for clustering are first established under Gaussian mixtures. The minimax lower bound unveils an intriguing informational dimension-reduction phenomenon: there exists a substantial gap between the minimax rate and the oracle clustering risk, with the former determined solely by the projected centers and projected covariance matrices in a low-dimensional space. Motivated by the lower bound, a novel, computationally efficient clustering method is proposed: covariance projected spectral clustering (COPO). Its key step is to project the high-dimensional data onto the low-dimensional space spanned by the cluster centers and then use the projected covariance matrices in this space to enhance clustering. Tight algorithmic upper bounds are established for COPO, both for Gaussian noise with flexible covariance and for general noise with local dependence. The theory indicates the minimax optimality of COPO in the Gaussian case and highlights its adaptivity to a broad spectrum of dependent noise. Extensive simulation studies under various noise structures and a real data analysis demonstrate the method's superior performance.
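
The two-stage idea described in the abstract (project onto a low-dimensional subspace, then use projected covariances when assigning labels) can be illustrated with a minimal Python sketch. This is not the authors' COPO algorithm, whose details the abstract does not give. It rests on several assumptions: the span of the cluster centers is approximated by the top-k right singular vectors of the data matrix, initial labels come from a plain Lloyd (k-means) iteration with farthest-first initialization, and the covariance step is a Gaussian quadratic-discriminant rule (Mahalanobis distance plus log-determinant). All function names are hypothetical.

```python
import numpy as np


def project_top_k(X, k):
    """Project rows of X onto the top-k right singular subspace.

    Assumption: this subspace approximates the span of the cluster centers.
    """
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:k].T


def lloyd(Y, k, n_iter=25):
    """Plain k-means (Lloyd) with deterministic farthest-first initialization."""
    centers = [Y[0]]
    for _ in range(1, k):
        # Squared distance from each point to its nearest chosen center.
        d2 = np.min([((Y - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(Y[d2.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d2 = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Y[labels == j].mean(0)
    return labels


def refine_with_covariance(Y, labels, k, n_iter=10, ridge=1e-6):
    """Reassign points using per-cluster means and covariances in the projected space.

    Assumption: a Gaussian quadratic-discriminant score stands in for the
    covariance-aware step mentioned in the abstract.
    """
    n, d = Y.shape
    for _ in range(n_iter):
        scores = np.full((n, k), np.inf)
        for j in range(k):
            Yj = Y[labels == j]
            if len(Yj) <= d:  # too few points for a usable covariance estimate
                continue
            mu = Yj.mean(0)
            cov = np.atleast_2d(np.cov(Yj, rowvar=False)) + ridge * np.eye(d)
            diff = Y - mu
            maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
            scores[:, j] = maha + np.linalg.slogdet(cov)[1]
        labels = scores.argmin(1)
    return labels


def projected_covariance_clustering_sketch(X, k):
    """Hypothetical end-to-end sketch: project, initialize, then refine."""
    Y = project_top_k(X, k)
    return refine_with_covariance(Y, lloyd(Y, k), k)


if __name__ == "__main__":
    # Synthetic two-cluster example with cluster-specific anisotropic noise.
    rng = np.random.default_rng(0)
    n, p, k = 200, 50, 2
    truth = np.repeat([0, 1], n // 2)
    centers = np.zeros((k, p))
    centers[0, 0], centers[1, 1] = 6.0, 6.0
    scales = np.ones((k, p))
    scales[0, 2], scales[1, 3] = 2.0, 2.0
    X = centers[truth] + rng.standard_normal((n, p)) * scales[truth]
    labels = projected_covariance_clustering_sketch(X, k)
    # Accuracy up to label swapping (valid for k = 2 only).
    print(max(np.mean(labels == truth), np.mean(labels != truth)))
```

Working in the k-dimensional projected space keeps every covariance estimate at size k by k, so the refinement step stays cheap and well conditioned even when the ambient dimension is large relative to the sample size.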