Title: CESME: Cluster analysis with latent semiparametric mixture models
Authors: Wen Zhou - Colorado State University (United States) [presenting]
Hui Zou - University of Minnesota (United States)
Lyuou Zhang - Colorado State University (United States)
Lulu Wang - Gilead Sciences (United States)
Abstract: Model-based clustering is one of the most popular statistical approaches in unsupervised learning and is widely used in practice for exploratory analysis, data visualization, sub-community identification, and quality control. Despite its wide applicability, the traditional assumption of Gaussianity is too stringent to be validated in general, which prevents model-based clustering from being applied to data with complex distributions, such as highly skewed data. We propose a flexible semiparametric latent model for clustering multivariate data that deviate from Gaussianity. The model assumes that the observed random vectors are obtained from unknown monotone transformations of latent variables governed by a Gaussian mixture distribution. The identifiability of the proposed model is carefully studied. An alternating maximization procedure is developed to estimate the model, and its convergence properties are investigated through finite-sample analysis. An interesting transition phenomenon in the convergence of the algorithm, which arises from the presence of the unknown transformations, is explored and provides guidance on the design of the algorithm. The proposed method is also assessed numerically through extensive simulations and demonstrates superior performance compared with most contemporary competitors.
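To make the data-generating mechanism described in the abstract concrete, the following is a minimal simulation sketch, not the authors' estimation procedure. It assumes, purely for illustration, a two-component latent Gaussian mixture in two dimensions with diagonal covariances, and coordinate-wise strictly increasing transformations (exp and cubing). The function names, mixture parameters, and transformations are all hypothetical choices and do not come from the abstract.

```python
import math
import random


def sample_latent_mixture(n, weights, means, sds, rng):
    """Draw n latent vectors from a Gaussian mixture.

    Assumption for illustration: each component has a diagonal covariance,
    so coordinates are drawn independently given the component label.
    """
    labels, latent = [], []
    for _ in range(n):
        # Pick a mixture component according to the mixing weights.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        z = [rng.gauss(m, s) for m, s in zip(means[k], sds[k])]
        labels.append(k)
        latent.append(z)
    return labels, latent


# Hypothetical strictly increasing transformations, one per coordinate.
# In the model these are unknown; they are fixed here only to simulate data.
TRANSFORMS = [math.exp, lambda z: z ** 3]


def observe(latent, transforms=TRANSFORMS):
    """Apply the coordinate-wise monotone transformations to latent vectors."""
    return [[f(z) for f, z in zip(transforms, row)] for row in latent]


rng = random.Random(0)
labels, latent = sample_latent_mixture(
    n=500,
    weights=[0.4, 0.6],
    means=[[-2.0, 0.0], [2.0, 1.0]],
    sds=[[1.0, 1.0], [1.0, 0.5]],
    rng=rng,
)
observed = observe(latent)  # skewed, non-Gaussian data with latent clusters
```

Because each transformation is monotone, the ordering of every coordinate is the same in `observed` as in `latent`. This is why rank-based quantities are unaffected by the unknown transformations, even though the observed marginals can be heavily skewed.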