View Submission - EcoSta 2025
A0165
Title: Some models are useful, but for how long: A decision-theoretic approach about when to refit large-scale models
Authors: Kentaro Hoffman - University of Washington (United States) [presenting]
Tyler McCormick - University of Washington (United States)
Abstract: Large-scale prediction models using tools from artificial intelligence (AI) or machine learning (ML) are increasingly common across a variety of industries and scientific domains. Despite their effectiveness, training AI and ML tools at scale can cost tens or hundreds of thousands of dollars (or more), and even after a model is trained, substantial resources must be invested to keep it up to date. A decision-theoretic framework is presented for deciding when to refit an AI/ML model in settings where the goal is to perform unbiased statistical inference using partially AI/ML-generated data. Drawing on portfolio optimization theory, the decision of ${\it recalibrating}$ a model or statistical inference versus ${\it refitting}$ the model is treated as a choice between ``investing'' in one of two ``assets.'' One asset, recalibrating the model based on another model, is quick and relatively inexpensive but bears uncertainty from sampling and may not be robust to model drift. The other asset, ${\it refitting}$ the model, is costly but removes the drift concern (though not the statistical uncertainty from sampling). The proposed framework balances these two potential investments while preserving statistical validity. The framework is evaluated using simulations and data on electricity usage and flu trend prediction.
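To make the portfolio analogy concrete, the sketch below shows a standard two-asset mean-variance (Markowitz-style) allocation between a "recalibrate" asset and a "refit" asset. This is not the authors' method, only a minimal illustration of the investment framing under assumed inputs: the expected benefits, standard deviations, correlation, and risk-aversion parameter are hypothetical placeholders, and the function name refit_share is invented for this example.

# Minimal sketch of a two-asset mean-variance allocation, used here only to
# illustrate the "recalibrate vs. refit as two investments" analogy from the
# abstract. All inputs (expected benefits, variances, correlation, risk
# aversion) are hypothetical placeholders, not quantities from the paper.


def refit_share(mu_refit, mu_recal, sd_refit, sd_recal, rho=0.0, risk_aversion=1.0):
    """Fraction of the 'budget' allocated to refitting under a Markowitz-style
    two-asset rule: maximize w'mu - (gamma/2) w'Sigma w subject to weights
    summing to one, then clip to [0, 1] (no short-selling or leverage)."""
    var_spread = sd_refit**2 + sd_recal**2 - 2 * rho * sd_refit * sd_recal
    if var_spread == 0:  # assets indistinguishable in risk; decide by mean only
        return 1.0 if mu_refit >= mu_recal else 0.0
    w = ((mu_refit - mu_recal) / risk_aversion
         + sd_recal**2 - rho * sd_refit * sd_recal) / var_spread
    return min(max(w, 0.0), 1.0)


if __name__ == "__main__":
    # Hypothetical numbers: refitting has a higher expected benefit (it removes
    # drift) but recalibration has a much less variable cost-adjusted payoff.
    w_refit = refit_share(mu_refit=1.0, mu_recal=0.6,
                          sd_refit=0.8, sd_recal=0.3,
                          rho=0.2, risk_aversion=2.0)
    print(f"share of effort allocated to refitting: {w_refit:.2f}")

In this toy rule, a larger expected benefit from refitting or a lower risk aversion pushes the allocation toward refitting, while the lower variability of recalibration pulls it back; the paper's framework addresses the analogous trade-off while preserving statistical validity.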