CMStatistics 2016
B0741
Title: Robust PCA in the presence of outlying cells
Authors: Wannes Van den Bossche - KU Leuven (Belgium) [presenting]
Mia Hubert - KU Leuven (Belgium)
Peter Rousseeuw - KU Leuven (Belgium)
Abstract: Principal component analysis (PCA) is a popular dimension reduction technique that is typically used as a first step when exploring high-dimensional data. Classical PCA is known to be highly sensitive to outliers. Several robust alternatives have been developed that yield accurate loadings in the presence of outliers. However, these methods treat outliers as entire rows of the data matrix, whereas often only a few cells in a row are outlying. Downweighting an entire row then leads to an unnecessary loss of information. Furthermore, in high-dimensional data it can easily happen that more than half of the rows contain such cellwise outliers, which causes current rowwise robust PCA methods to break down. We introduce a new method for robust principal component analysis that can handle cellwise outliers. In addition, it provides estimates for deviating data cells as well as for missing values. The code will be made available in the Matlab toolbox LIBRA and in R. Some real data examples will be shown.
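The claim that more than half of the rows can contain cellwise outliers can be illustrated with a small calculation. The sketch below is not part of the submitted method; it assumes, purely for illustration, that each cell is outlying independently with the same probability `eps`, so that a row of `d` cells is affected with probability 1 - (1 - eps)^d. The function names and the simulation settings are hypothetical.

```python
import random


def contaminated_row_fraction(eps: float, d: int) -> float:
    """Probability that a row of d cells has at least one outlying cell.

    Assumption (illustrative only): each cell is outlying independently
    with probability eps.
    """
    return 1.0 - (1.0 - eps) ** d


def simulate_row_fraction(eps: float, d: int, n_rows: int = 5000, seed: int = 0) -> float:
    """Monte Carlo check of the formula above under the same assumption."""
    rng = random.Random(seed)
    hit = 0
    for _ in range(n_rows):
        # A row counts as contaminated if any of its d cells is flagged.
        if any(rng.random() < eps for _ in range(d)):
            hit += 1
    return hit / n_rows


if __name__ == "__main__":
    eps = 0.05  # hypothetical share of outlying cells
    for d in (5, 20, 100):
        exact = contaminated_row_fraction(eps, d)
        approx = simulate_row_fraction(eps, d)
        print(f"d={d:4d}  exact={exact:.3f}  simulated={approx:.3f}")
```

Under this assumption, with 5% of cells outlying and 20 variables, roughly 64% of the rows contain at least one outlying cell, which already exceeds the 50% of contaminated rows that a rowwise robust method can tolerate.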