CFE-CMStatistics 2024
A0917
Title: On the handling of method failure in comparison studies
Authors: Milena Wuensch - LMU Munich (Germany) [presenting]
Moritz Herrmann - LMU Munich (Germany)
Elisa Noltenius - LMU Munich (Germany)
Mattia Mohr - LMU Munich (Germany)
Tim Morris - MRC Clinical Trials Unit at UCL (United Kingdom)
Anne-Laure Boulesteix - LMU Munich (Germany)
Abstract: Comparison studies in machine learning and classical statistics are intended to compare methods in an evidence-based manner, offering guidance to data analysts in selecting a suitable method. A common challenge is handling the failure of some methods to produce a result for some (real or simulated) data sets, so that their performance cannot be measured in those instances. Despite increasing emphasis in the recent literature, there is little guidance on proper handling and interpretation, and reporting of the chosen approach is often neglected. The aim is to fill this gap and provide practical guidance for handling method failure in comparison studies. In particular, it is shown that the two most commonly applied approaches, namely discarding the corresponding data sets (either for all methods or only for the failing ones) and imputing the missing performance values, can lead to misleading method recommendations. It is also illustrated how method failure in published comparison studies, including simulation studies and real-data benchmarking across contexts such as regression modelling, statistical testing, and machine learning, may manifest in different ways but is always caused by a complex interplay of two aspects. Based on this, recommendations for dealing with method failure are provided that rely neither on discarding data sets nor on imputation. Finally, these recommendations and the dangers of inadequate handling of method failure are demonstrated in two illustrative comparison studies.
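As a toy illustration of the abstract's point about discarding versus imputing (a minimal sketch with hypothetical numbers, not drawn from the studies described above), the following shows how the two common approaches can produce opposite method rankings from the same benchmark results:

```python
# Hypothetical error values (lower is better) for two methods on five data
# sets; None marks a method failure. These numbers are invented for
# illustration only.
errors_A = [0.20, 0.25, 0.30, 0.35, 0.30]   # A succeeds everywhere
errors_B = [0.10, 0.15, None, None, 0.12]   # B fails on two data sets

def mean_complete_case(a, b):
    """Average only over data sets where both methods produced a result."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    return (sum(x for x, _ in pairs) / len(pairs),
            sum(y for _, y in pairs) / len(pairs))

# Approach 1: discard data sets with any failure (complete-case analysis).
mean_A_cc, mean_B_cc = mean_complete_case(errors_A, errors_B)
# B looks better: it is only evaluated where it happened to succeed.

# Approach 2: impute a worst-case error of 1.0 for each failure.
WORST = 1.0
mean_A_full = sum(x if x is not None else WORST for x in errors_A) / len(errors_A)
mean_B_full = sum(x if x is not None else WORST for x in errors_B) / len(errors_B)
# A looks better: B is heavily penalised for its failures.

print(f"complete-case: A={mean_A_cc:.3f}, B={mean_B_cc:.3f}")
print(f"imputed:       A={mean_A_full:.3f}, B={mean_B_full:.3f}")
```

The ranking of the two methods flips depending solely on how the failures are handled, which is the kind of misleading recommendation the abstract warns about.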