A1183
Title: A generalized mean approach for distributed-PCA
Authors: Zhi-Yu Jou - Academia Sinica (Taiwan) [presenting]
Su-Yun Huang - Academia Sinica (Taiwan)
Hung Hung - National Taiwan University (Taiwan)
Shinto Eguchi - The Institute of Statistical Mathematics (Japan)
Abstract: Principal component analysis (PCA) is a fundamental technique for dimensionality reduction. With the rapid growth of data in both size and complexity, distributed PCA (DPCA) has gained increasing attention. A key challenge in DPCA is how to efficiently compute and aggregate results across multiple machines while preserving the statistical characteristics of the original dataset. A prior study proposed a communication-efficient DPCA algorithm for estimating the leading rank-r eigenspace of the population covariance matrix. However, that method aggregates only the leading local eigenvectors and ignores the eigenvalue information, which may lead to suboptimal estimation accuracy. The aim is to propose a novel DPCA algorithm, referred to as beta-DPCA, that incorporates eigenvalue information by aggregating local results via the matrix beta-mean. The proposal offers a flexible and robust aggregation method through an adjustable choice of beta. Moreover, the beta-mean is shown to be associated with the matrix beta-divergence, a subclass of the Bregman matrix divergence, which supports the robustness of beta-DPCA.
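
To make the aggregation step concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation. With p = 0 it reproduces the eigenvector-only aggregation described above, assuming this means averaging the local projection matrices and taking the top-r eigenvectors of the average. With p > 0 it applies a hypothetical eigenvalue weighting, chosen here only to show where eigenvalue information could enter; it is not the matrix beta-mean, whose definition is not given in the abstract. The function names, the parameter p, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def local_top_r(X, r):
    """Top-r eigenvalues and eigenvectors of the sample covariance of X (rows = samples)."""
    S = np.cov(X, rowvar=False)
    vals, vecs = np.linalg.eigh(S)            # eigh returns ascending eigenvalues
    idx = np.argsort(vals)[::-1][:r]          # indices of the r largest
    return vals[idx], vecs[:, idx]


def aggregate(local_results, r, p=0.0):
    """Average V_k diag(lambda_k**p) V_k^T over machines, then take the top-r eigenvectors.

    p = 0: eigenvectors only (average of local projection matrices).
    p > 0: hypothetical eigenvalue weighting, NOT the matrix beta-mean.
    """
    d = local_results[0][1].shape[0]
    M = np.zeros((d, d))
    for vals, vecs in local_results:
        M += (vecs * vals**p) @ vecs.T        # scales column j by vals[j]**p
    M /= len(local_results)
    w, U = np.linalg.eigh(M)
    return U[:, np.argsort(w)[::-1][:r]]


def subspace_distance(U, V):
    """Frobenius distance between the projection matrices onto span(U) and span(V)."""
    return np.linalg.norm(U @ U.T - V @ V.T, ord="fro")


# Simulated setting (assumed): m machines, n samples each, dimension d, rank r.
rng = np.random.default_rng(0)
d, r, m, n = 10, 2, 5, 500
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
spectrum = np.array([10.0, 6.0] + [1.0] * (d - r))
Sigma = (Q * spectrum) @ Q.T                  # population covariance
V_true = Q[:, :r]                             # true leading eigenspace

local_results = []
for _ in range(m):
    X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)
    local_results.append(local_top_r(X, r))   # each machine sends only r values and r vectors

V_avg = aggregate(local_results, r, p=0.0)
V_weighted = aggregate(local_results, r, p=1.0)
print(subspace_distance(V_avg, V_true), subspace_distance(V_weighted, V_true))
```

In this sketch each machine communicates only its top-r eigenpairs, which is what keeps the scheme communication-efficient; the choice of p only changes how those local summaries are combined on the central machine.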