CFE-CMStatistics 2024
A0833
Title: Distributionally robust halfspace depth
Authors: Jevgenijs Ivanovs - University of Aarhus (Denmark)
Pavlo Mozharovskyi - LTCI, Telecom Paris, Institut Polytechnique de Paris (France) [presenting]
Abstract: A statistical data depth function measures the centrality of an observation with respect to a distribution or a data set by a value between 0 and 1, while satisfying certain postulates regarding invariance, monotonicity, and convexity. Data depth is a rapidly developing field that meets growing demand in industry, the economy, the social sciences, etc. One of the most studied depth notions, Tukey's halfspace depth, can be seen as a stochastic program and, as such, suffers from the optimizer's curse: a limited training sample results in poor out-of-sample performance. A generalized halfspace depth concept is proposed that relies on recent advances in distributionally robust optimization: halfspaces are examined using the worst-case distribution in the Wasserstein ball around the empirical law. This new depth can be seen as a smoothed and regularized version of the classical halfspace depth, which is retrieved as the radius of the Wasserstein ball vanishes. It inherits the main properties of the classical depth and enjoys various new attractive features, such as continuity and strict positivity beyond the convex hull of the support. Numerical illustrations of the new depth and its advantages are provided, and some fundamental theory is developed. In particular, the upper-level sets and the median region are studied, including their breakdown properties, and the distributionally robust halfspace depth is applied to outlier detection and supervised classification.
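To make the construction concrete, the following is a minimal illustrative sketch and not the authors' implementation. It rests on several assumptions that the abstract does not state: the Wasserstein distance is taken of order 1 with Euclidean ground cost, the infimum over halfspaces is approximated over a finite set of unit directions supplied by the caller, and all function names are hypothetical. Under these assumptions, the worst-case mass of a closed halfspace is obtained greedily, by moving the sample points nearest to the halfspace into it until the transport budget is spent.

```python
import numpy as np


def halfspace_depth(x, data, dirs):
    """Classical empirical Tukey depth of x, approximated over the rows of dirs.

    dirs holds unit vectors u; each defines the closed halfspace {y : u.(y - x) >= 0}.
    """
    proj = (data - x) @ dirs.T              # shape (n, num_dirs)
    return float((proj >= 0).mean(axis=0).min())


def worst_case_mass(signed_dist, radius):
    """Largest mass a halfspace can receive from any law within order-1
    Wasserstein distance `radius` of the empirical law (assumed formulation).

    signed_dist[i] >= 0 means sample point i already lies in the halfspace;
    otherwise -signed_dist[i] is its Euclidean distance to the halfspace.
    """
    n = signed_dist.size
    cost = np.sort(np.maximum(-signed_dist, 0.0))   # cheapest points first
    cum = np.cumsum(cost)
    budget = n * radius                             # each point carries mass 1/n
    k = int(np.searchsorted(cum, budget, side="right"))  # points moved in full
    if k == n:
        return 1.0
    spent = cum[k - 1] if k > 0 else 0.0
    # Move a fraction of the next-cheapest point with the remaining budget.
    return (k + (budget - spent) / cost[k]) / n


def robust_halfspace_depth(x, data, dirs, radius):
    """Minimum over directions of the worst-case halfspace mass."""
    proj = (data - x) @ dirs.T
    return min(worst_case_mass(proj[:, j], radius) for j in range(dirs.shape[0]))
```

For example, take the four corners of the unit square as data, the four axis directions as `dirs`, and the point (2, 0.5) outside the convex hull. The classical depth is 0, whereas the robust depth with radius 0.25 is 0.25, which illustrates strict positivity beyond the convex hull. With radius 0 the two functions coincide, in line with the classical depth being retrieved as the radius vanishes.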