Title: Multiple imputation using sequential regression for high-dimensional data
Authors: Faisal Maqbool Zahid - Ludwig-Maximilians-University Munich (Germany) [presenting]
Christian Heumann - Ludwig-Maximilians-University Munich (Germany)
Abstract: Missing data are ubiquitous in almost every field of research. Multiple imputation (MI) is a commonly used technique for filling in missing data with plausible values. In applied research it is common to face a large number of variables for a moderate number of cases. In such situations, the existing standard MI approach either performs poorly or fails altogether when $p > n$, that is, when the number of variables $p$ exceeds the number of cases $n$. Very limited literature is available on how to cope with this issue when using MI. The best strategy for multiply imputing missing values in high-dimensional data therefore remains an open question. To address the issue, we propose a penalized version of standard MI that uses lasso regression to impute the missing values in high-dimensional data. In several simulation studies, we compare the performance of our algorithm with that of some existing algorithms as the dimension of the data increases. Performance is compared using the Mean Squared Imputation Error (MSIE) and the Mean Absolute Imputation Error (MAIE). The results suggest that the lasso-based MI approach is a better option for imputation with high-dimensional data.
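The abstract names MSIE and MAIE but does not define them. The sketch below shows one plausible reading, assuming both are averages taken only over the entries that were missing in a simulated dataset where the true values are known. The function name `imputation_errors` and its arguments are hypothetical and are not taken from the authors' work. It scores a single completed dataset; with several imputations, one option would be to average the two errors across them.

```python
def imputation_errors(true_values, imputed_values, missing_mask):
    """Return (MSIE, MAIE) computed over the entries flagged as missing.

    Assumed definitions (not confirmed by the abstract):
      MSIE = mean of (imputed - true)^2 over missing entries
      MAIE = mean of |imputed - true|   over missing entries
    All three arguments are equally shaped lists of rows.
    """
    squared_sum = 0.0
    absolute_sum = 0.0
    count = 0
    for true_row, imputed_row, mask_row in zip(true_values, imputed_values, missing_mask):
        for true_x, imputed_x, is_missing in zip(true_row, imputed_row, mask_row):
            if is_missing:
                diff = imputed_x - true_x
                squared_sum += diff * diff
                absolute_sum += abs(diff)
                count += 1
    if count == 0:
        raise ValueError("no missing entries to evaluate")
    return squared_sum / count, absolute_sum / count


# Toy data: two entries were masked out and then imputed.
true_values = [[1.0, 2.0], [3.0, 4.0]]
imputed_values = [[1.0, 2.5], [2.0, 4.0]]
missing_mask = [[False, True], [True, False]]

msie, maie = imputation_errors(true_values, imputed_values, missing_mask)
print(msie, maie)
```

Restricting the averages to the masked entries keeps the observed values, which any imputation method reproduces exactly, from diluting the error.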