CFE-CMStatistics 2024
A1235
Title: Asymptotic properties of k-means and its bias correction under high dimensional settings
Authors: Kento Egashira - Tokyo University of Science (Japan) [presenting]
Kazuyoshi Yata - University of Tsukuba (Japan)
Makoto Aoshima - University of Tsukuba (Japan)
Abstract: K-means is widely regarded as an effective method for analyzing high-dimension, low-sample-size (HDLSS) data, yet its asymptotic properties in such settings remain underexplored. The behavior of k-means is investigated in practical high-dimensional settings. Based on these findings, a bias-corrected version of k-means is proposed to improve its performance. Additionally, the lack of theoretical analysis of kernel k-means in high-dimensional settings is addressed by examining its properties under a Gaussian kernel, which allows a theoretical comparison between kernel k-means and standard k-means. Numerical simulations demonstrate the effectiveness of the proposed bias-corrected k-means and of conventional k-means, including kernel k-means, in high-dimensional contexts.