Title: Spectral deconfounding
Authors: Domagoj Cevid - ETH Zurich (Switzerland) [presenting]
Peter Buehlmann - ETH Zurich (Switzerland)
Nicolai Meinshausen - ETH Zurich (Switzerland)
Abstract: High-dimensional regression methods that rely on the sparsity of the ground truth, such as the Lasso, can break down in the presence of confounding variables. If a latent variable affects both the response and the predictors, the observed correlation between them no longer reflects the underlying sparse relationship alone. Such hidden confounding can be represented as a high-dimensional linear model in which the sparse coefficient vector is perturbed. For this model, we develop and investigate a class of methods based on running the Lasso on preprocessed data. The preprocessing step applies spectral transformations that change the singular values of the design matrix. We show that, under some assumptions, one can achieve the optimal $\ell_1$-error rate for estimating the underlying sparse coefficient vector, and we illustrate the performance of the methods on a genomic dataset.
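As a rough illustration of the preprocessing step, the sketch below builds one possible spectral transformation: it caps the singular values of the design matrix at their median and leaves the singular vectors unchanged. The abstract does not specify the transformation, so this particular choice, the function name `spectral_transform`, the threshold `tau`, and the simulated model $X = H\Gamma + E$, $Y = X\beta + H\delta + \varepsilon$ with hidden $H$ are all assumptions made for the example.

```python
import numpy as np


def spectral_transform(X, tau=None):
    """Return an n x n matrix F such that F @ X has singular values min(d_i, tau).

    Assumed choice for illustration: cap the singular values at their median.
    """
    # Thin SVD: X = U diag(d) V^T, with U of shape (n, min(n, p)).
    U, d, _ = np.linalg.svd(X, full_matrices=False)
    if tau is None:
        tau = np.median(d)
    # Per-direction shrinkage factor min(d_i, tau) / d_i (1 where d_i == 0).
    scale = np.ones_like(d)
    positive = d > 0
    scale[positive] = np.minimum(d[positive], tau) / d[positive]
    # F = U diag(scale) U^T
    return (U * scale) @ U.T


# Simulated data with q hidden confounders (all settings are hypothetical).
rng = np.random.default_rng(0)
n, p, q = 50, 200, 2
H = rng.standard_normal((n, q))              # latent confounders
Gamma = rng.standard_normal((q, p))          # effect of H on the predictors
X = H @ Gamma + rng.standard_normal((n, p))  # observed design matrix
beta = np.zeros(p)
beta[:5] = 1.0                               # sparse coefficient vector
delta = np.ones(q)                           # effect of H on the response
Y = X @ beta + H @ delta + 0.1 * rng.standard_normal(n)

# Preprocess both the design matrix and the response with the same F.
F = spectral_transform(X)
X_tilde, Y_tilde = F @ X, F @ Y

d_before = np.linalg.svd(X, compute_uv=False)
d_after = np.linalg.svd(X_tilde, compute_uv=False)
```

The transformed pair `(X_tilde, Y_tilde)` would then be passed to any Lasso solver in place of `(X, Y)`, for example scikit-learn's `Lasso(alpha=...).fit(X_tilde, Y_tilde)`, with the penalty level chosen by cross-validation or similar. The intuition assumed here is that the confounders inflate the leading singular values of the design matrix, so shrinking those directions reduces their influence on the fit.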