CFE 2019 - CMStatistics
Title: An extended k-means clustering procedure with unique factors
Authors: Masamichi Ito - Osaka University (Japan) [presenting]
Kohei Adachi - Osaka University (Japan)
Abstract: A k-means clustering (KMC) procedure is applied to an observations-by-variables data matrix in order to classify the observations into a few clusters. In KMC, the data matrix is modeled as the product of a membership matrix and a cluster center matrix, plus an error matrix. We extend the KMC model by incorporating a unique factor part: in the proposed clustering procedure, the data matrix is modeled as the sum of the product in the original KMC model, the unique factor part, and an error matrix. The unique factor part is the product of a unique factor score matrix and a diagonal matrix, with the former constrained to be column-orthonormal. This orthonormality, together with the diagonality of the latter matrix, implies that the unique factors and the variables are in one-to-one correspondence, and that each factor specifically explains the variation in its corresponding variable that remains unaccounted for by the clusters. The proposed procedure is thus useful for analyzing data sets that include variables whose variation is not well explained by clusters. We present an alternating least squares algorithm for the proposed procedure and assess its behavior numerically.
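The abstract does not give the algorithm's update formulas, so the following is only a minimal sketch of how an alternating least squares scheme for such a model could look, not the authors' implementation. It assumes the model can be written as X = MC + UΨ + E, where M (binary membership), C (cluster centers), U (unique factor scores, U'U = I) and Ψ (diagonal) are symbols introduced here for illustration. It also assumes the least squares loss ||X - MC - UΨ||² is minimized with no constraints beyond those stated in the abstract, and that there are at least as many observations as variables. The function name `kmc_unique_factors` and its parameters are hypothetical. The U step uses the standard orthogonal Procrustes solution via a thin SVD.

```python
import numpy as np

def kmc_unique_factors(X, k, n_iter=50, seed=0):
    """Sketch: ALS for X ~ M C + U diag(psi), with U'U = I (assumed model)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape  # assumes n >= p so U (n x p) can be column-orthonormal
    labels = rng.integers(k, size=n)                   # random initial memberships
    psi = np.ones(p)                                   # diagonal of Psi (arbitrary start)
    U = np.linalg.qr(rng.standard_normal((n, p)))[0]   # random orthonormal start
    C = np.zeros((k, p))
    losses = []
    for _ in range(n_iter):
        # Residual after removing the unique factor part: R = X - U Psi
        R = X - U * psi
        # C step: each center is the mean of its members' rows of R
        for j in range(k):
            members = labels == j
            if members.any():          # an empty cluster keeps its old center
                C[j] = R[members].mean(axis=0)
        # M step: assign each observation to its nearest center
        dist = ((R[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Residual after removing the cluster part: Y = X - M C
        Y = X - C[labels]
        # U step: maximize tr(U' Y Psi) subject to U'U = I (Procrustes)
        P, _, Qt = np.linalg.svd(Y * psi, full_matrices=False)
        U = P @ Qt
        # Psi step: psi_j = u_j' y_j, since each column u_j has unit length
        psi = (U * Y).sum(axis=0)
        losses.append(float(((Y - U * psi) ** 2).sum()))
    return labels, C, U, psi, losses
```

Because each of the four steps minimizes the loss over one block of parameters with the others held fixed, the recorded loss should not increase from one iteration to the next. As with ordinary k-means, the result depends on the starting values, so several random starts (different `seed` values) would normally be compared.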