CMStatistics 2023
B1023
Title: Imputation of missing values in multi-view data
Authors: Wouter van Loon - Leiden University (Netherlands) [presenting]
Marjolein Fokkema - Leiden University (Netherlands)
Mark De Rooij - Leiden University (Netherlands)
Abstract: Data in which a set of objects is described by multiple distinct feature sets (called views) are known as multi-view data. When missing values occur in multi-view data, all features in a view are likely to be missing simultaneously. This leads to very large quantities of missing data which, especially when combined with high dimensionality, make the application of conditional imputation methods computationally infeasible. A new imputation method is introduced, based on the existing stacked penalized logistic regression (StaPLR) algorithm for multi-view learning. It performs imputation in a dimension-reduced space to address the computational challenges inherent to the multi-view context. The performance of the new method is compared with that of several existing imputation algorithms on simulated data sets. The results show that the new method achieves competitive results at a much lower computational cost, and makes advanced imputation algorithms such as missForest and predictive mean matching usable in settings where they would otherwise be computationally infeasible.
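The abstract does not spell out the procedure, so the following is only a rough sketch of one way to read "imputation in a dimension-reduced space". It assumes that each view is first reduced to a single predicted probability by a penalized logistic regression fitted on the rows where that view is observed, and that imputation is then applied to the resulting small matrix of view-level predictions rather than to the raw features. The function names (`fit_ridge_logistic`, `meta_level_impute`), the simulated data, the in-sample predictions (a stacking procedure would normally use cross-validated ones), the plain ridge penalty, and the column-mean imputer are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fit_ridge_logistic(X, y, lam=1.0, n_iter=500, lr=0.1):
    """Logistic regression with an L2 penalty, fitted by gradient descent.
    Hypothetical stand-in for the penalized base learner."""
    n, p = X.shape
    w, b = np.zeros(p), 0.0
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (prob - y) / n + lam * w / n)
        b -= lr * np.mean(prob - y)
    return w, b

def predict_proba(X, w, b):
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

def meta_level_impute(views, y):
    """views: list of (n, p_v) arrays; a missing view is a row of all NaN.
    Returns the (n, V) matrix of view-level predictions before and after imputation."""
    Z = np.full((len(y), len(views)), np.nan)
    for v, X in enumerate(views):
        observed = ~np.isnan(X).any(axis=1)
        w, b = fit_ridge_logistic(X[observed], y[observed])
        # Simplification: in-sample predictions instead of cross-validated ones.
        Z[observed, v] = predict_proba(X[observed], w, b)
    # Placeholder imputer (column means). Because Z has only V columns, a more
    # expensive imputer could be substituted here at modest cost.
    Z_imputed = np.where(np.isnan(Z), np.nanmean(Z, axis=0), Z)
    return Z, Z_imputed

# Simulated example: two views, with view 2 entirely missing for 40 objects.
rng = np.random.default_rng(0)
n = 200
X1 = rng.normal(size=(n, 50))
X2 = rng.normal(size=(n, 30))
y = (X1[:, 0] + X2[:, 0] + rng.normal(size=n) > 0).astype(float)
X2[:40, :] = np.nan

Z, Z_imputed = meta_level_impute([X1, X2], y)
raw_missing_cells = int(np.isnan(X2).sum())    # cells to impute in feature space
reduced_missing_cells = int(np.isnan(Z).sum()) # cells to impute in reduced space

# Meta-learner on the imputed view-level predictions (again only a sketch).
w_meta, b_meta = fit_ridge_logistic(Z_imputed, y)
```

In this toy setting the feature-space problem has 40 x 30 missing cells, while the reduced problem has one missing cell per affected object and view, which illustrates why imputing after dimension reduction can be much cheaper.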