View Submission - EcoSta2023
A1124
Title: Parsimonious and semi-constrained models for clustering mixed-type data through a composite likelihood approach
Authors: Monia Ranalli - Sapienza University of Rome (Italy) [presenting]
Roberto Rocci - Sapienza University of Rome (Italy)
Abstract: Twelve parsimonious models for clustering mixed-type (ordinal and continuous) data are proposed. Ordinal and continuous data are assumed to follow a multivariate finite mixture of Gaussians. Two closely related issues arise as the dimensionality of the data increases: the number of parameters increases exponentially, and a large number of ordinal variables makes full maximum likelihood estimation infeasible. To address the first issue, the model should be made more parsimonious in terms of the number of parameters to estimate. To this end, a general class of eight parsimonious mixture models for mixed-type data is defined by imposing a factor decomposition on the component-specific covariance matrices. The loadings and the error-term variances of the factor model may be constrained to be equal or unequal across mixture components. To add flexibility while maintaining a degree of parsimony, four further models are defined in which the latent factors are the same in each cluster but have different variances. A useful feature of these semi-constrained models is that, under mild conditions, the factors are unique. In other words, the factors cannot be rotated as they can in the classical factor analysis model. To address the second issue, a composite likelihood approach is adopted. Estimates are computed using an EM-type algorithm based on the composite likelihood. The proposal is evaluated through a simulation study and an application to real data.
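To illustrate why a factor decomposition of the component covariance matrices reduces the parameter count, the following minimal sketch compares the number of free covariance parameters in an unconstrained Gaussian mixture with that of a factor-analytic mixture whose loadings and error variances are either shared across components or component-specific. The function names, the symbols p (variables), q (factors) and G (components), and the usual q(q-1)/2 rotational correction are assumptions made for illustration; they are not taken from the abstract, and the counts ignore means, mixing weights and ordinal thresholds.

```python
def full_cov_params(p, G):
    """Free covariance parameters with one unconstrained p x p matrix per component."""
    return G * p * (p + 1) // 2


def factor_cov_params(p, q, G, shared_loadings, shared_errors):
    """Free covariance parameters when each component covariance is
    Lambda @ Lambda.T + Psi, with Lambda of size p x q and Psi diagonal.

    Assumption: the standard factor-analysis count p*q - q*(q-1)/2 for the
    loadings (rotational indeterminacy removed) and p for the error variances.
    """
    loadings = p * q - q * (q - 1) // 2
    errors = p
    n_loadings = loadings if shared_loadings else G * loadings
    n_errors = errors if shared_errors else G * errors
    return n_loadings + n_errors


if __name__ == "__main__":
    p, q, G = 10, 2, 3  # hypothetical sizes chosen only for this example
    print("full covariance:", full_cov_params(p, G))
    for shared_loadings in (True, False):
        for shared_errors in (True, False):
            n = factor_cov_params(p, q, G, shared_loadings, shared_errors)
            print(f"loadings shared={shared_loadings}, "
                  f"errors shared={shared_errors}: {n}")
```

With these hypothetical sizes, every constrained variant needs fewer covariance parameters than the unconstrained mixture, and the saving grows as p increases relative to q.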