View Submission - EcoSta2024
A0164
Title: A communication efficient boosting method for distributed spectral clustering
Authors: Yingqiu Zhu - University of International Business and Economics (China) [presenting]
Abstract: Spectral clustering is one of the most popular clustering techniques in statistical inference. For large-scale datasets, it is typically implemented through distributed computing. However, existing distributed implementations face two major challenges. First, clustering performance suffers because the topological structure of the full set of objects has to be split into distributed parts. Second, exchanging data between computers in a distributed system incurs high communication costs. A communication-efficient algorithm for distributed spectral clustering is proposed to address these issues. The motivation stems from a theoretical comparison between conventional (global) spectral clustering, which operates on the entire dataset, and local spectral clustering, which is performed on a subsample using a single computer. This comparison identifies the critical factor behind the performance gap between global and local spectral clustering. Based on this finding, a novel approach is proposed that iteratively aggregates the intermediate results generated by local spectral clustering. In this process, only low-dimensional vectors are exchanged between computers. Simulations and real data analysis demonstrate that the proposed method clearly improves the performance of distributed spectral clustering while keeping communication costs low.
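
The global-versus-local comparison that motivates the work can be illustrated with a small sketch. This is not the proposed algorithm, which the abstract does not specify; it only runs a textbook two-way spectral split once on a full toy dataset and once on a subsample, as a single worker would. Everything here is an assumption made for illustration: the names `spectral_bipartition` and `agreement`, the Gaussian affinity with bandwidth `sigma`, the power iteration used in place of a full eigensolver, and the every-third-point subsample that stands in for a random one.

```python
import math
import random


def spectral_bipartition(points, sigma=1.0, iters=300, seed=0):
    """Two-way spectral split (illustrative helper, not from the abstract).

    Uses the second eigenvector of the normalized affinity matrix
    D^{-1/2} W D^{-1/2}, found by power iteration with deflation.
    """
    n = len(points)
    # Gaussian affinity with a zero diagonal (assumed kernel choice).
    W = [[0.0 if i == j else
          math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in W]
    root = [math.sqrt(d) for d in deg]
    # Adding the identity shifts the eigenvalues into [0, 2], so power
    # iteration converges to the largest eigenvalues of the original matrix.
    M = [[W[i][j] / (root[i] * root[j]) + (1.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # The leading eigenvector is known in closed form: sqrt(degree), normalized.
    total = math.sqrt(sum(deg))
    v1 = [r / total for r in root]
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(n)]
    for _ in range(iters):
        v = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        # Deflate: remove the component along the leading eigenvector.
        dot = sum(a * b for a, b in zip(v, v1))
        v = [a - dot * b for a, b in zip(v, v1)]
        length = math.sqrt(sum(a * a for a in v))
        v = [a / length for a in v]
    # The sign of the degree-rescaled second eigenvector gives the split.
    return [1 if v[i] / root[i] > 0 else 0 for i in range(n)]


def agreement(labels, truth):
    """Fraction of matching labels, invariant to swapping the two labels."""
    same = sum(a == b for a, b in zip(labels, truth)) / len(truth)
    return max(same, 1 - same)


# Toy data: two well-separated blobs of 30 points each.
rng = random.Random(0)
truth = [0] * 30 + [1] * 30
points = [(5 * t + rng.gauss(0, 0.5), 5 * t + rng.gauss(0, 0.5)) for t in truth]

# Global: one machine sees every point.
global_labels = spectral_bipartition(points)

# Local: one worker sees only a subsample (every third point here).
sub = list(range(0, len(points), 3))
local_labels = spectral_bipartition([points[i] for i in sub])

print("global agreement:", agreement(global_labels, truth))
print("local agreement:", agreement(local_labels, [truth[i] for i in sub]))
```

On blobs this well separated, the global and local runs should recover the same split. The performance gap the abstract refers to would only appear on harder data, where a subsample carries less of the full similarity structure.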