CMStatistics 2023
B1155
Title: Clustering mixed-type data
Authors: Marianthi Markatou - University at Buffalo (United States) [presenting]
Abstract: Clustering mixed-type data, measured on interval/ratio and categorical (ordinal or nominal) scales, is a challenging problem. The literature includes a number of algorithms for clustering such data. KAMILA, a method for clustering mixed-type data that does not require strong assumptions, is discussed first. Subsequently, MEDEA (Multivariate Eigenvalue Decomposition Error Adjustment) is developed, a weighting scheme that allows the algorithm to properly handle data sets in which only a subset of the variables is related to the underlying cluster structure of interest. MEDEA performs well even in the presence of a large number of uninformative variables. The properties of the methods are studied, and their performance is demonstrated using Monte Carlo simulations and real data sets.