B0749
Title: Sparse sketches with small inversion bias
Authors: Edgar Dobriban - University of Pennsylvania (United States) [presenting]
Abstract: For a tall $n\times d$ matrix $A$ and a random $m\times n$ sketching matrix $S$, the sketched estimate of the inverse covariance matrix $(A^TA)^{-1}$ is typically biased: $E[(A_s^TA_s)^{-1}]\neq(A^TA)^{-1}$, where $A_s=SA$. This phenomenon, which we call inversion bias, arises, e.g., in statistics and distributed optimization, when averaging multiple independently constructed estimates of quantities that depend on the inverse covariance matrix. We develop a framework for analyzing inversion bias, based on our proposed concept of an $(\epsilon,\delta)$-unbiased estimator for random matrices. We show that when the sketching matrix $S$ is dense and has i.i.d. sub-Gaussian entries, then after simple rescaling, the estimator $(\frac{m}{m-d}A_s^TA_s)^{-1}$ is $(\epsilon,\delta)$-unbiased for $(A^TA)^{-1}$ with a sketch of size $m=O(d+\sqrt{d}/\epsilon)$. In particular, this implies that for $m=O(d)$, the inversion bias of this estimator is $O(1/\sqrt{d})$, which is much smaller than the $\Theta(1)$ approximation error obtained as a consequence of the subspace embedding guarantee for sub-Gaussian sketches. We then propose a new sketching technique, called LEverage Score Sparsified (LESS) embeddings, which uses ideas from both data-oblivious sparse embeddings and data-aware leverage-based row sampling methods to achieve $\epsilon$ inversion bias for sketch size $m=O(d\log{d}+\sqrt{d}/\epsilon)$ in time $O(\mathrm{nnz}(A)\log{n}+md^2)$, where $\mathrm{nnz}(A)$ is the number of non-zero entries of $A$.
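As a concrete illustration (not part of the submission) of the inversion bias and the $\frac{m}{m-d}$ rescaling above, the following minimal Monte Carlo sketch uses a dense Gaussian sketching matrix as a stand-in for the sub-Gaussian case; the dimensions, sketch size, and trial count are arbitrary choices for the demonstration.

```python
# Minimal Monte Carlo illustration (assumed setup, not the paper's code):
# average many sketched inverse-covariance estimates and compare the naive
# estimator (A_s^T A_s)^{-1} with the rescaled ((m/(m-d)) A_s^T A_s)^{-1}.
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 500, 20, 100            # tall data matrix, sketch size m > d
A = rng.standard_normal((n, d))
target = np.linalg.inv(A.T @ A)   # (A^T A)^{-1}

trials = 1000
naive = np.zeros((d, d))
rescaled = np.zeros((d, d))
for _ in range(trials):
    S = rng.standard_normal((m, n)) / np.sqrt(m)   # i.i.d. Gaussian sketch
    As = S @ A
    inv_s = np.linalg.inv(As.T @ As)
    naive += inv_s / trials
    rescaled += (m - d) / m * inv_s / trials       # ((m/(m-d)) A_s^T A_s)^{-1}

rel = lambda B: np.linalg.norm(B - target) / np.linalg.norm(target)
print(f"naive relative bias:    {rel(naive):.4f}")
print(f"rescaled relative bias: {rel(rescaled):.4f}")
```

For Gaussian $S$ the naive averaged estimate overshoots by roughly a factor $\frac{m}{m-d-1}$, while the rescaled version is nearly unbiased, matching the abstract's claim that simple rescaling removes most of the inversion bias.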
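Below is a simplified, hedged sketch of the LESS idea under stated assumptions: exact leverage scores are computed via a thin QR factorization for clarity (the abstract's $O(\mathrm{nnz}(A)\log n + md^2)$ running time requires fast approximate scores instead), and the per-row sparsity `s` is a free parameter here. The scaling is chosen so that $E[S^TS]=I_n$, making the sketched Gram matrix $A_s^TA_s$ unbiased for $A^TA$.

```python
# Simplified LESS-style embedding (an illustrative sketch, not the paper's
# implementation): each of the m sketch rows carries s nonzero entries placed
# at row indices drawn from the leverage score distribution of A, with random
# signs and scaling 1/sqrt(m * s * p_j) so that E[S^T S] = I_n.
import numpy as np

def less_sketch(A, m, s, rng):
    n, d = A.shape
    Q, _ = np.linalg.qr(A)                 # thin QR; exact scores for clarity
    lev = np.sum(Q**2, axis=1)             # leverage scores, summing to d
    p = lev / lev.sum()                    # sampling distribution over rows
    S = np.zeros((m, n))
    for i in range(m):
        idx = rng.choice(n, size=s, p=p)   # leverage-based positions
        signs = rng.choice([-1.0, 1.0], size=s)
        # np.add.at accumulates correctly when an index is drawn twice
        np.add.at(S[i], idx, signs / np.sqrt(m * s * p[idx]))
    return S

rng = np.random.default_rng(1)
n, d, m, s = 2000, 20, 200, 30
A = rng.standard_normal((n, d))
As = less_sketch(A, m, s, rng) @ A
# Sanity check: the sketched Gram matrix approximates A^T A in spectral norm.
err = np.linalg.norm(As.T @ As - A.T @ A, 2) / np.linalg.norm(A.T @ A, 2)
print(f"relative Gram error: {err:.3f}")
```

This construction mirrors the hybrid described in the abstract: the positions of the nonzeros are data-aware (leverage-based row sampling), while the signs and fixed per-row sparsity come from data-oblivious sparse embeddings.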