CMStatistics 2015
B0766
Title: Comparison of statistical methods for multivariate outlier detection
Authors: Aurore Archimbaud - Erasmus University Rotterdam (Netherlands) [presenting]
Anne Ruiz-Gazen - Toulouse School of Economics (France)
Klaus Nordhausen - University of Jyvaskyla (Finland)
Abstract: Detecting multivariate outliers is relevant in many fields, such as fraud detection and manufacturing defect detection. Several unsupervised multivariate methods exist, and some are based on robust or non-robust covariance matrix estimators: the Mahalanobis distance (MD) and its robust version (RD), robust Principal Component Analysis (PCA) with its diagnostic plot, and Invariant Coordinate Selection (ICS). The objective is to compare these methods. All of them yield one or several scores for each observation, and high scores indicate potential outliers. For robust PCA and ICS, some components are selected and outliers are identified by a test procedure. This last step is not trivial: relevant cut-offs have to be determined and the simultaneity of the tests has to be taken into account. The comparison is performed on simulated data sets drawn from mixtures of Gaussian distributions, with a small proportion of outliers and a number of observations at least five times the number of variables. The Minimum Covariance Determinant (MCD) estimator is considered. The implementation is based on functions from the R packages robustbase, rrcov, ICS and CerioliOutlierDetection.
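
As an illustration of the score-and-cut-off step described in the abstract, here is a minimal sketch of the classical (non-robust) squared Mahalanobis distance in two dimensions, with a pointwise chi-square cut-off and a Bonferroni-adjusted one to account for the simultaneity of tests. It is not the authors' implementation, which relies on the R packages listed above. The simulated data, the 0.975 level, the choice of Bonferroni as the adjustment, and all function names are assumptions made for this example. The robust distance (RD) would use the same formula with the mean and covariance replaced by MCD estimates.

```python
import math
import random


def mean_and_cov(points):
    """Sample mean and unbiased sample covariance of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def squared_mahalanobis(points, center, cov):
    """Squared Mahalanobis distance of each point, using the 2x2 inverse."""
    mx, my = center
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    scores = []
    for x, y in points:
        dx, dy = x - mx, y - my
        scores.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return scores


def chi2_quantile_2df(prob):
    """Chi-square quantile with 2 degrees of freedom (closed form)."""
    return -2.0 * math.log(1.0 - prob)


# Assumed toy data: 49 standard Gaussian points plus one planted outlier.
rng = random.Random(0)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(49)]
data.append((10.0, 10.0))

center, cov = mean_and_cov(data)
scores = squared_mahalanobis(data, center, cov)

n = len(data)
level = 0.975  # assumed pointwise level
pointwise_cutoff = chi2_quantile_2df(level)
# Bonferroni adjustment: split the error rate 1 - level across the n tests.
bonferroni_cutoff = chi2_quantile_2df(1.0 - (1.0 - level) / n)

flagged_pointwise = [i for i, s in enumerate(scores) if s > pointwise_cutoff]
flagged_bonferroni = [i for i, s in enumerate(scores) if s > bonferroni_cutoff]

print("pointwise cut-off:", round(pointwise_cutoff, 3), "flagged:", flagged_pointwise)
print("Bonferroni cut-off:", round(bonferroni_cutoff, 3), "flagged:", flagged_bonferroni)
```

The pointwise cut-off can flag a few regular observations by chance, since each of the n tests is run at the same level; the adjusted cut-off is stricter. Because the planted outlier also inflates the classical covariance estimate, its score is smaller than it would be under a robust estimate, which is the motivation for robust versions such as RD.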