CFE-CMStatistics 2024
A0304
Title: Clustering categorical data using a Bayesian mixture of finite mixtures of latent class analysis models
Authors: Gertraud Malsiner-Walli - WU Vienna University of Economics and Business (Austria)
Bettina Gruen - WU Vienna University of Economics and Business (Austria) [presenting]
Sylvia Fruehwirth-Schnatter - WU Vienna University of Economics and Business (Austria)
Abstract: A Bayesian approach is proposed for model-based clustering of multivariate categorical data where variables are allowed to be associated within clusters and the number of clusters is unknown. The approach uses a two-layer finite mixture of mixtures model in which the cluster distributions are approximated by latent class analysis models. A careful specification of the priors, with suitable hyperparameter values, is crucial to identify the two-layer structure and to obtain a parsimonious cluster solution. Bayesian estimation based on Markov chain Monte Carlo sampling with the telescoping sampler is outlined, and it is described how to obtain an identified clustering model by resolving the label-switching issue. A simulation study using artificial data and an application to a data set on low back pain indicate that the proposed approach clusters well, provided that the hyperparameters are selected to induce sufficient shrinkage.
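To make the two-layer structure concrete, the following is a minimal sketch of how data could be simulated from a mixture of mixtures of latent class models. It is not the authors' code and does not implement the telescoping sampler or any estimation step. The function name, the sizes (K clusters, L latent classes per cluster, J variables, D categories) and the Dirichlet parameters are all hypothetical choices made for illustration. Within a latent class the variables are independent; mixing over the latent classes of a cluster is what allows the variables to be associated within that cluster.

```python
import numpy as np

def simulate_mixture_of_lca(n, K=3, L=2, J=5, D=4, seed=0):
    """Draw n observations of J categorical variables with D levels each
    from a two-layer mixture: K clusters, each a mixture of L latent classes.
    All sizes and Dirichlet parameters are illustrative assumptions."""
    rng = np.random.default_rng(seed)

    # Upper layer: cluster weights, shape (K,)
    eta = rng.dirichlet(np.ones(K))
    # Lower layer: latent class weights within each cluster, shape (K, L)
    w = rng.dirichlet(np.ones(L), size=K)
    # Category probabilities per cluster, class and variable, shape (K, L, J, D)
    pi = rng.dirichlet(np.full(D, 0.5), size=(K, L, J))

    # Draw the cluster label, then the latent class within that cluster
    cluster = rng.choice(K, size=n, p=eta)
    latent_class = np.array([rng.choice(L, p=w[k]) for k in cluster])

    # Given the latent class, variables are drawn independently
    y = np.empty((n, J), dtype=int)
    for i in range(n):
        for j in range(J):
            y[i, j] = rng.choice(D, p=pi[cluster[i], latent_class[i], j])
    return y, cluster, latent_class

y, cluster, latent_class = simulate_mixture_of_lca(200)
```

In a clustering analysis only `y` would be observed; `cluster` is the target partition, while `latent_class` belongs to the lower layer that merely approximates each cluster distribution.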