View Submission - EcoSta2024
A0210
Title: On model-based clustering of multivariate categorical sequences
Authors: Yingying Zhang - Western Michigan University (United States) [presenting]
Volodymyr Melnykov - The University of Alabama (United States)
Abstract: Clustering algorithms designed for quantitative data have been explored extensively in the literature. However, many real-life data sets consist of categorical variables, including categorical sequences. Existing models for such data are designed for univariate categorical sequences. Oftentimes, a single categorical sequence does not describe the data accurately enough, whereas observations expressed as multivariate categorical sequences can properly reflect their dynamic nature. Currently, there is a lack of models developed for this type of data. A mixture model for multivariate categorical sequences is proposed and shown to classify observations successfully. The proposed model is also more parsimonious than the traditional first-order Markov model. Applications to both synthetic and real-life data demonstrate the superiority of the proposed method.