A0168
Title: Perturbation theory for cross data matrix-based PCA
Authors: Shao-Hsuan Wang - National Central University (Taiwan) [presenting]
Su-Yun Huang - Academia Sinica (Taiwan)
Abstract: PCA has long been a useful tool for dimension reduction. Cross data matrix (CDM)-based PCA is an alternative way to estimate principal components: the data are split into two subsets, and a singular value decomposition is computed for the cross product of the corresponding covariance matrices. CDM-based PCA has a broader region of consistency than ordinary PCA for the leading eigenvalues and eigenvectors. We present finite-sample approximation results, as well as the asymptotic behavior, for CDM-based PCA via matrix perturbation. Moreover, we introduce a comparison measure for CDM-based PCA versus ordinary PCA. This measure depends only on the data dimension, the noise correlations, and the noise-to-signal ratio (NSR). Using this measure, we develop an algorithm that selects good partitions and integrates the results from these partitions to form a final CDM-based PCA estimate. Numerical and real data examples are presented for illustration.
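The split-and-decompose step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `cdm_pca`, the random half-and-half split, the use of the square root of the singular values as eigenvalue estimates, and the use of left singular vectors as direction estimates are all choices made here for illustration. The partition-selection algorithm and the comparison measure mentioned in the abstract are not reproduced, since the abstract does not specify them.

```python
import numpy as np


def cdm_pca(X, k, seed=None):
    """Hypothetical sketch of CDM-based PCA for an (n, d) data matrix X.

    Returns (eigenvalue estimates of shape (k,), directions of shape (d, k)).
    """
    rng = np.random.default_rng(seed)
    n, _ = X.shape

    # Assumption: one random split of the samples into two halves.
    perm = rng.permutation(n)
    idx1, idx2 = perm[: n // 2], perm[n // 2:]

    # Center each subset and form its sample covariance matrix.
    X1 = X[idx1] - X[idx1].mean(axis=0)
    X2 = X[idx2] - X[idx2].mean(axis=0)
    S1 = X1.T @ X1 / (len(idx1) - 1)
    S2 = X2.T @ X2 / (len(idx2) - 1)

    # SVD of the cross product of the two covariance matrices.
    U, s, _ = np.linalg.svd(S1 @ S2)

    # Assumption: if S1 and S2 both estimate the same covariance, the
    # singular values of S1 @ S2 are on the scale of squared eigenvalues,
    # so their square roots serve as eigenvalue estimates here.
    return np.sqrt(s[:k]), U[:, :k]
```

Calling `cdm_pca(X, k=2, seed=0)` on an `(n, d)` array returns two eigenvalue estimates and a `(d, 2)` matrix of estimated directions. Forming the full `d x d` covariance matrices keeps the sketch close to the wording of the abstract, but it would be costly when the dimension is very large.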