COMPSTAT 2016
Title: Recovery of weak signal in high dimensional linear regression by data perturbation
Authors: Yongli Zhang - University of Oregon (United States) [presenting]
Abstract: Recovering weak signals (i.e., small nonzero regression coefficients) is difficult in high-dimensional data with multicollinearity. Both exhaustive search and stepwise methods fail to select the true model when the nonzero coefficients fall below some threshold. We propose a procedure, Perturbed Model Selection (PMS), that recovers weak signals by adding random perturbations to the feature matrix. Theory and simulations show that PMS substantially improves the chance of recovering informative features and outperforms other methods at limited computational cost. On the theoretical side, the aim is to derive a quantitative relationship between selection consistency and computation and to demonstrate the trade-off between them. A real-data example shows that PMS improves forecasting by combining the power of decorrelation and resampling.
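The abstract does not spell out the PMS algorithm, so the following is only a minimal sketch of the general idea of selecting features on randomly perturbed copies of the feature matrix. The Gaussian noise, the greedy forward selection, the frequency-based aggregation, and the names `forward_select`, `perturbed_selection_frequencies`, `n_perturb`, and `noise_scale` are all assumptions made for illustration, not the author's procedure.

```python
import numpy as np


def forward_select(X, y, k):
    """Greedy forward selection: at each step add the feature that
    gives the smallest residual sum of squares (illustrative only)."""
    n_features = X.shape[1]
    selected = []
    for _ in range(k):
        best_j, best_rss = None, np.inf
        for j in range(n_features):
            if j in selected:
                continue
            cols = selected + [j]
            beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            rss = np.sum((y - X[:, cols] @ beta) ** 2)
            if rss < best_rss:
                best_j, best_rss = j, rss
        selected.append(best_j)
    return selected


def perturbed_selection_frequencies(X, y, k, n_perturb=50,
                                    noise_scale=0.5, seed=0):
    """Hypothetical sketch: add Gaussian noise to X, rerun the selection,
    and return how often each feature was chosen across perturbations."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(X.shape[1])
    for _ in range(n_perturb):
        # Assumed perturbation: i.i.d. Gaussian noise on every entry of X.
        X_pert = X + noise_scale * rng.standard_normal(X.shape)
        counts[forward_select(X_pert, y, k)] += 1
    return counts / n_perturb
```

In this sketch, features with a high selection frequency would be the candidates to keep. Increasing `n_perturb` costs more computation in exchange for more stable frequencies, which loosely mirrors the trade-off between selection consistency and computation that the abstract mentions.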