EcoSta 2023 Submission A0908
Title: Perturbation analysis of randomized SVD and its applications to high-dimensional statistics
Authors: Minh Tang - North Carolina State University (United States) [presenting]
Yichi Zhang - North Carolina State University (United States)
Abstract: Randomized singular value decomposition (RSVD) is a class of computationally efficient algorithms for computing the truncated SVD of large data matrices. Given an $n\times n$ symmetric matrix $M$, the prototypical RSVD algorithm outputs an approximation of the $k$ leading singular vectors of $M$ by computing the SVD of $M^g G$, where $g$ is an integer and $G$ is a random Gaussian sketching matrix. The statistical properties of RSVD are discussed under a general signal-plus-noise framework, i.e., the observed matrix $\hat{M}$ is assumed to be an additive perturbation of some true but unknown signal matrix $M$. Upper bounds are first derived for the spectral and $2\to\infty$ norm distances between the approximate singular vectors of $\hat{M}$ and the true singular vectors of the signal matrix $M$. These upper bounds depend on the signal-to-noise ratio (SNR) and the number of power iterations $g$. A phase transition phenomenon is observed in which a smaller SNR requires larger values of $g$ to guarantee convergence of the spectral and $2\to\infty$ distances. Finally, normal approximations are presented for the row-wise fluctuations of the approximate singular vectors and the entrywise fluctuations of the approximate matrix. The theoretical results are illustrated by deriving nearly-optimal performance guarantees for RSVD when applied to three statistical inference problems: community detection, matrix completion, and principal component analysis with missing data.
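A minimal sketch of the prototypical RSVD step described in the abstract, assuming NumPy; the function name, the default number of power iterations, and the absence of oversampling are illustrative choices, not details taken from the paper:

```python
import numpy as np

def rsvd(M, k, g=2, rng=None):
    # Prototypical RSVD as stated in the abstract: approximate the k leading
    # singular vectors of a symmetric n x n matrix M via the SVD of the
    # sketch M^g G, where G is an n x k Gaussian sketching matrix.
    # (Defaults and the lack of oversampling are illustrative assumptions.)
    rng = np.random.default_rng() if rng is None else rng
    n = M.shape[0]
    G = rng.standard_normal((n, k))   # Gaussian sketching matrix
    Y = G
    for _ in range(g):                # g power iterations: Y = M^g G
        Y = M @ Y
    # Left singular vectors of the sketch approximate those of M.
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    return U[:, :k]
```

In the signal-plus-noise setting, the same routine is applied to the observed matrix $\hat{M}$; the resulting approximate singular vectors are the quantities whose spectral and $2\to\infty$ norm deviations from the leading singular vectors of the signal matrix $M$ are bounded in the paper.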