A0769
Title: A Graph-based Approach to Estimating the Number of Clusters
Authors: Lynna Chu - Iowa State University (United States) [presenting]
Abstract: We consider the problem of estimating the number of clusters in a dataset and propose a non-parametric approach that uses similarity graphs to construct a robust statistic capturing similarity information among observations. This graph-based statistic is applicable to datasets of any dimension, is computationally efficient, and can be paired with any clustering technique. Asymptotic theory is developed to establish the selection consistency of the proposed approach. Simulation studies demonstrate that the graph-based statistic outperforms existing methods for estimating the number of clusters, especially in the high-dimensional setting.
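The abstract does not define the proposed statistic, so the following is only a minimal sketch of the general idea of reading similarity information off a graph. It assumes a k-nearest-neighbor graph as the similarity graph and uses the fraction of edges that stay within a cluster as an illustrative quantity; the function names `knn_edges` and `within_cluster_edge_fraction`, the choice of k, and the toy data are all hypothetical and are not taken from the talk.

```python
import math


def knn_edges(points, k):
    """Return the undirected edge set of a k-nearest-neighbor graph.

    Assumption: Euclidean distance; the abstract does not say which
    similarity graph or distance the proposed method uses.
    """
    edges = set()
    for i, p in enumerate(points):
        # Rank all other points by distance to point i.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        # Connect point i to its k closest points.
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def within_cluster_edge_fraction(points, labels, k=2):
    """Fraction of k-NN edges whose endpoints share a cluster label.

    Illustrative quantity only, not the statistic from the talk.
    The labels may come from any clustering technique.
    """
    edges = knn_edges(points, k)
    same = sum(1 for i, j in edges if labels[i] == labels[j])
    return same / len(edges)


# Toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
good_labels = [0, 0, 0, 1, 1, 1]  # matches the two groups
bad_labels = [0, 1, 0, 1, 0, 1]   # cuts across the groups

good_score = within_cluster_edge_fraction(points, good_labels)
bad_score = within_cluster_edge_fraction(points, bad_labels)
print(good_score, bad_score)
```

In this toy example every graph edge joins points from the same group, so a labeling that matches the groups keeps all edges within clusters, while a labeling that cuts across the groups does not. A quantity like this cannot select the number of clusters by itself, since it tends to fall as the number of clusters grows, and how the proposed statistic handles that is not stated in the abstract.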