A1172
Title: Extending TCLUST to higher dimensions
Authors: Luis Angel Garcia-Escudero - Universidad de Valladolid (Spain) [presenting]
Lucia Trapote-Reglero - Universidad de Valladolid (Spain)
Agustin Mayo-Iscar - Universidad de Valladolid (Spain)
Abstract: Outliers are known to significantly distort the results of many commonly used clustering methods, often leading to unreliable cluster partitions. To address this issue, different robust clustering approaches have been developed that not only reduce the influence of outliers but also facilitate the detection of meaningful ones. The focus is on robust clustering methods based on trimming, especially TCLUST, which extends the type of trimming used by the MCD in one-population problems to allow for several subpopulations or clusters unknown in advance. While TCLUST performs well on low-dimensional data, it struggles with high-dimensional datasets due to the large number of parameters that must be estimated. The robust linear grouping (RLG) method offers an alternative by assuming that clusters lie near lower-dimensional subspaces, thus combining clustering with dimensionality reduction. However, RLG has limitations when subspaces intersect, and it assumes simplistic isotropic orthogonal errors. A robust clustering method extending TCLUST is presented, which builds on the high-dimensional data clustering (HDDC) method by incorporating trimming and eigenvalue constraints. This approach strikes a balance between TCLUST and RLG and requires careful adaptation of the TCLUST and HDDC steps for proper implementation. An extension allowing for cellwise trimming is also outlined.
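To illustrate the trimming idea underlying TCLUST-type methods, the following is a minimal sketch of impartial trimming in clustering: a simplified trimmed k-means in which, at every iteration, the fraction alpha of points farthest from their nearest centre is discarded before the centres are updated. This is only an illustrative simplification, not the authors' TCLUST/HDDC extension; the function name trimmed_kmeans and the parameters alpha, n_iter, and seed are assumptions made for the example.

```python
# Illustrative sketch of impartial trimming in clustering (simplified
# trimmed k-means); NOT the TCLUST/HDDC method described in the abstract.
import numpy as np

def trimmed_kmeans(X, k, alpha=0.1, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    centres = X[rng.choice(n, size=k, replace=False)]
    keep = int(np.ceil((1.0 - alpha) * n))          # number of points retained
    for _ in range(n_iter):
        # squared distances from every point to every centre
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)                  # nearest centre
        closest = d2.min(axis=1)
        kept = np.argsort(closest)[:keep]           # trim the alpha farthest points
        for j in range(k):
            members = kept[labels[kept] == j]
            if members.size:
                centres[j] = X[members].mean(axis=0)
    trimmed = np.setdiff1d(np.arange(n), kept)      # indices flagged as outliers
    return centres, labels, trimmed

# Example: two well-separated Gaussian clusters plus gross contamination.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)),
               rng.normal(8, 1, (100, 2)),
               rng.uniform(-30, 30, (10, 2))])      # contaminating points
centres, labels, trimmed = trimmed_kmeans(X, k=2, alpha=0.05)
print(centres)
print("trimmed points:", trimmed.size)
```

The full TCLUST approach additionally fits cluster-specific scatter matrices under eigenvalue-ratio constraints, and the extension presented here further models clusters near lower-dimensional subspaces in the spirit of HDDC; the sketch above only conveys the trimming step shared by this family of methods.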