CMStatistics 2023
B1052
Title: MixtureMissing: an R package for robust and flexible model-based clustering with incomplete data
Authors: Hung Tong - The University of Alabama (United States) [presenting]
Abstract: Model-based clustering refers to a broad class of cluster analysis methods that aim to uncover heterogeneity in a data set by means of a finite mixture model. In real applications, data often come in the form of partially observed records and can exhibit heavy tails, asymmetry, and skewness. Given these practical challenges, developing robust and flexible model-based clustering methods for incomplete data has been a particularly active research area. In the presence of mild outliers, the multivariate contaminated normal mixture (MCNM) yields robust estimates of the component parameters and performs outlier detection automatically. On the other hand, the multivariate generalized hyperbolic mixture (MGHM) and its limiting case, the multivariate skew-t mixture (MStM), are flexible tools that encompass a variety of non-Gaussian mixtures. To make these models more accessible to statisticians and researchers from different fields, the R package MixtureMissing includes them all and implements them for data sets with values missing at random. In addition, various initialization strategies, information criteria for model selection, model summaries, and visualization tools are readily available.
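As a rough illustration of how a contaminated normal component can flag outliers when some values are missing, the sketch below computes the posterior probability that a point belongs to the "good" part of a single component. It is not the MixtureMissing implementation and does not use the package's API. It assumes the commonly used parameterization alpha * N(mu, Sigma) + (1 - alpha) * N(mu, eta * Sigma) with eta > 1, and it handles missing entries (coded as NaN) by evaluating the density on the observed coordinates only. The function names `mvn_logpdf` and `prob_good` and all parameter values are hypothetical.

```python
import numpy as np


def mvn_logpdf(x, mu, sigma):
    """Log-density of a multivariate normal N(mu, sigma) evaluated at x."""
    d = x.shape[0]
    chol = np.linalg.cholesky(sigma)          # sigma = chol @ chol.T
    z = np.linalg.solve(chol, x - mu)         # whitened residual
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + z @ z)


def prob_good(x, mu, sigma, alpha, eta):
    """Posterior probability that x comes from the uncontaminated part.

    Assumed model (not taken from the abstract):
        alpha * N(mu, sigma) + (1 - alpha) * N(mu, eta * sigma), eta > 1.
    NaN entries of x are treated as missing and dropped, so only the
    marginal density of the observed coordinates is used. At least one
    entry of x must be observed.
    """
    obs = ~np.isnan(x)
    x_o, mu_o = x[obs], mu[obs]
    sigma_o = sigma[np.ix_(obs, obs)]
    log_good = np.log(alpha) + mvn_logpdf(x_o, mu_o, sigma_o)
    log_bad = np.log1p(-alpha) + mvn_logpdf(x_o, mu_o, eta * sigma_o)
    return 1.0 / (1.0 + np.exp(log_bad - log_good))


# Hypothetical parameters for one bivariate component.
mu = np.zeros(2)
sigma = np.eye(2)
alpha, eta = 0.9, 10.0

p_center = prob_good(np.array([0.0, 0.0]), mu, sigma, alpha, eta)
p_far = prob_good(np.array([10.0, 10.0]), mu, sigma, alpha, eta)
p_partial = prob_good(np.array([np.nan, 0.0]), mu, sigma, alpha, eta)

# A point is flagged as a mild outlier when its probability falls below 0.5.
print(p_center > 0.5, p_far > 0.5, p_partial > 0.5)
```

Under this assumed rule, a point near the component mean is kept as a regular observation, a point far from it is flagged, and a partially observed point is judged on its observed coordinates alone.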