A1372

Title: On statistical reproducibility of ROC-based diagnostic tests
Authors: Tahani Coolen-Maturi - Durham University (United Kingdom)
Frank Coolen - Durham University (United Kingdom)
Hamdah Alshamari - Durham University (United Kingdom) [presenting]
Abstract: Hypothesis tests based on diagnostic accuracy measures are widely applied in medical science and healthcare, yet repeating a test can lead to a different conclusion. Assessing the statistical reproducibility of such tests is therefore essential to ensure reliable results. The reproducibility probability (RP) is the probability that the same test outcome would be obtained if the test were repeated under identical conditions with the same sample size. Nonparametric predictive inference (NPI) offers a predictive framework for studying RP, and the NPI bootstrap method is used to evaluate RP for diagnostic accuracy tests. NPI is a frequentist framework that, unlike traditional methods, focuses on prediction rather than estimation. The approach is applied to the test for the area under the ROC curve (AUC), a widely used metric for evaluating how well a diagnostic test distinguishes between diseased and non-diseased individuals, and a simulation study illustrates the NPI reproducibility approach. The findings show that the RP of AUC tests can be low, particularly when the p-value is near the significance threshold, raising important concerns about reproducibility in diagnostic test evaluation.
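
To make the idea of RP for an AUC test concrete, the following is a minimal sketch, not the authors' implementation. It assumes a one-sided test of AUC = 0.5 against AUC > 0.5 based on the normal approximation to the Mann-Whitney statistic, and it assumes the measurements lie in a known finite range [lower, upper] so that the NPI bootstrap can sample from the end intervals. The function names, the choice of test, the bounds, and the use of a single point estimate for RP are all illustrative assumptions; the abstract does not specify these details.

```python
import bisect
import math
import random


def auc(healthy, diseased):
    """Empirical AUC: P(diseased > healthy) + 0.5 * P(tie)."""
    wins = 0.0
    for y in diseased:
        for x in healthy:
            if y > x:
                wins += 1.0
            elif y == x:
                wins += 0.5
    return wins / (len(healthy) * len(diseased))


def rejects_h0(healthy, diseased, alpha=0.05):
    """One-sided test of H0: AUC = 0.5 against H1: AUC > 0.5.

    Assumption: normal approximation to the Mann-Whitney statistic
    without tie correction, chosen only for illustration.
    """
    n0, n1 = len(healthy), len(diseased)
    se = math.sqrt((n0 + n1 + 1) / (12.0 * n0 * n1))
    z = (auc(healthy, diseased) - 0.5) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # 1 - Phi(z)
    return p_value < alpha


def npi_bootstrap_sample(data, lower, upper, rng):
    """Draw len(data) future values in the style of the NPI bootstrap.

    Each draw picks one of the intervals between consecutive ordered
    values (all equally likely), samples uniformly inside it, and is
    added to the data before the next draw.
    Assumption: all data lie within the finite range [lower, upper].
    """
    points = sorted(data)
    future = []
    for _ in range(len(data)):
        grid = [lower] + points + [upper]
        i = rng.randrange(len(grid) - 1)
        value = rng.uniform(grid[i], grid[i + 1])
        bisect.insort(points, value)
        future.append(value)
    return future


def reproducibility_probability(healthy, diseased, lower, upper,
                                n_boot=1000, alpha=0.05, seed=0):
    """Fraction of predicted future samples whose test decision
    matches the decision obtained from the original samples."""
    rng = random.Random(seed)
    original = rejects_h0(healthy, diseased, alpha)
    same = 0
    for _ in range(n_boot):
        h = npi_bootstrap_sample(healthy, lower, upper, rng)
        d = npi_bootstrap_sample(diseased, lower, upper, rng)
        if rejects_h0(h, d, alpha) == original:
            same += 1
    return same / n_boot
```

Calling `reproducibility_probability(healthy, diseased, lower, upper)` on two lists of test measurements returns a value between 0 and 1. Under this sketch, samples whose p-value falls close to `alpha` would be expected to give values well below 1, which is the pattern the abstract reports.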