View Submission - COMPSTAT2023
A0296
Title: Anomaly component analysis: Visualization and interpretability for anomaly detection
Authors: Romain Valla - Telecom Paris, Institut Polytechnique de Paris (France) [presenting]
Pavlo Mozharovskyi - Telecom Paris, Institut Polytechnique de Paris (France)
Florence d'Alché-Buc - Telecom Paris (France)
Abstract: At the crossroads of machine learning and data analysis, anomaly detection aims at identifying observations that exhibit abnormal behaviour. Be it measurement errors, disease development, severe weather, production quality defects, failed equipment, financial fraud or crisis events, the timely identification and isolation of anomalies is an important task in almost any area of industry and science. While a substantial body of literature is devoted to the detection of anomalies, little attention is paid to their explanation. This is mostly due to the intrinsically unsupervised nature of the task and the lack of robustness of exploratory methods such as principal component analysis (PCA). We introduce a new statistical tool dedicated to the exploratory analysis of abnormal observations using data depth as a score. Anomaly component analysis (ACA) is a method that searches for a low-dimensional data representation that best visualises and explains anomalies. Based on this, we further propose a procedure for finding clusters of anomalies in Euclidean space. In a comparative study using simulated and real data, ACA proves advantageous for anomaly analysis compared with methods present in the literature.
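To illustrate what "data depth as a score" can mean, the following minimal sketch computes the Mahalanobis depth, one of the simplest depth notions, and flags the observation with the lowest depth as the most anomalous. It is not the ACA method, and the abstract does not say which depth the authors use. The function name `mahalanobis_depth`, the simulated data and the planted anomaly are all assumptions made for illustration. This variant relies on the sample mean and covariance, so it is itself not robust.

```python
import numpy as np


def mahalanobis_depth(points, data):
    """Mahalanobis depth of each row of `points` with respect to `data`.

    Hypothetical helper for illustration: depth = 1 / (1 + d^2), where d^2 is
    the squared Mahalanobis distance to the sample mean of `data`.
    """
    mu = data.mean(axis=0)
    cov = np.cov(data, rowvar=False)
    diff = points - mu
    # Squared Mahalanobis distance of every point to the centre of the data.
    d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    return 1.0 / (1.0 + d2)


# Simulated data (an assumption): 200 standard normal points plus one
# planted anomaly far from the bulk, stored in the last row.
rng = np.random.default_rng(0)
normal = rng.normal(size=(200, 3))
anomaly = np.array([[6.0, 6.0, 6.0]])
data = np.vstack([normal, anomaly])

depths = mahalanobis_depth(data, data)
# Low depth means far from the centre, so the minimum is the top candidate.
most_anomalous = int(np.argmin(depths))
print(most_anomalous, depths[most_anomalous])
```

Depth values lie in (0, 1], with central observations close to 1 and outlying ones close to 0. Ranking observations by increasing depth gives an anomaly ordering.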