A0582
Title: Evaluating model performance through confidence intervals for the generalization error
Authors: Hannah Schulz-Kuempel - LMU Munich (Germany) [presenting]
Abstract: How accurately can a model predict outcomes for new, unseen data? This central question in predictive modeling is addressed by estimating the generalization error (GE), the expected loss between a model's predictions and the true outcomes on a new data point. Resampling methods form the crucial basis for estimating the GE without actually requiring new data. For resampling-based point estimates to be meaningful, however, precision information is needed in the form of confidence intervals (CIs). Unfortunately, computing a theoretically valid and practically accurate CI for the GE is complicated by the resampling setup, and despite the variety of methods proposed for deriving such CIs, there is currently no consensus on which performs best in which scenarios. Thirteen model-agnostic methods for deriving CIs for the GE are benchmarked across various supervised learning models and simulation designs, with the aim of providing an unbiased assessment of current techniques, establishing a foundation for evaluating future methods, and generating hypotheses for further research. The findings form the basis for cautious recommendations on how to compute CIs for the GE and highlight both trends and unexpected behaviors, offering insights into the complexities of resampling-based inference. The performance and intricacies of these methods, and their implications for model evaluation in machine learning for biostatistics, are discussed.
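To make the resampling setup concrete, the sketch below computes a cross-validated point estimate of the GE together with one deliberately simple CI: a normal approximation over the fold-wise losses. This is an illustrative example under assumed choices (scikit-learn, ridge regression on synthetic data, squared-error loss), not one of the thirteen benchmarked methods; naive intervals of this kind are precisely the sort whose validity is complicated by the resampling setup.

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import KFold

    # Synthetic regression data (assumed setup for illustration only)
    X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)

    # Mean squared-error loss on each held-out fold of a 5-fold cross-validation
    fold_losses = []
    for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
        model = Ridge().fit(X[train_idx], y[train_idx])
        residuals = y[test_idx] - model.predict(X[test_idx])
        fold_losses.append(np.mean(residuals ** 2))
    fold_losses = np.array(fold_losses)

    # Resampling-based point estimate of the GE
    ge_hat = fold_losses.mean()

    # Naive 95% normal-approximation CI from the fold-wise standard error;
    # fold losses are correlated (folds share training data), so nominal
    # coverage is not guaranteed
    se = fold_losses.std(ddof=1) / np.sqrt(len(fold_losses))
    ci = (ge_hat - 1.96 * se, ge_hat + 1.96 * se)
    print(f"GE estimate: {ge_hat:.1f}, naive 95% CI: [{ci[0]:.1f}, {ci[1]:.1f}]")

Because the folds share training data, the fold-wise losses are not independent, and the nominal coverage of such an interval need not hold; this dependence is what makes principled CI construction for the GE difficult and motivates the benchmark described above.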