A1245
Title: A model-agnostic ensemble framework with built-in LOCO feature importance inference
Authors: Lili Zheng - University of Illinois Urbana-Champaign (United States) [presenting]
Genevera Allen - Rice University (United States)
Luqin Gan - Rice University (United States)
Abstract: Interpretability and reliability are crucial desiderata when machine learning is applied in critical applications. However, generating interpretations and uncertainty quantifications for black-box ML models often requires significant extra computation and held-out data. A novel ensemble framework is introduced in which one can simultaneously train a predictive model and obtain uncertainty quantification for its interpretation, in the form of leave-one-covariate-out (LOCO) feature importance. This framework is almost model-agnostic and can be applied with any base model for regression or classification tasks. Most notably, it avoids model refitting and data splitting, and hence incurs no extra computational or statistical cost for uncertainty quantification. To ensure inference validity without data splitting, a number of challenges are addressed by leveraging the stability of the ensemble training process. Broad connections to selective inference and other model-agnostic feature importance inference methods are discussed. The framework is also demonstrated on real benchmark datasets.
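As context for the LOCO feature importance that the abstract targets, a minimal sketch of the naive leave-one-covariate-out procedure is shown below: refit the model once per left-out covariate and score each covariate by the resulting increase in held-out error. This is the refit-and-split baseline that the proposed ensemble framework is designed to avoid; the OLS base model, function names, and toy data here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def loco_importance(X_train, y_train, X_test, y_test):
    """Naive LOCO: refit once per left-out covariate.

    Returns, for each covariate j, the increase in held-out MSE
    when j is removed from the training data (larger = more important).
    """
    def fit_predict(Xtr, Xte):
        # Ordinary least squares with an intercept column as a simple base model.
        A = np.column_stack([np.ones(len(Xtr)), Xtr])
        coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
        return np.column_stack([np.ones(len(Xte)), Xte]) @ coef

    base_err = np.mean((y_test - fit_predict(X_train, X_test)) ** 2)
    scores = []
    for j in range(X_train.shape[1]):
        keep = [k for k in range(X_train.shape[1]) if k != j]
        err_j = np.mean(
            (y_test - fit_predict(X_train[:, keep], X_test[:, keep])) ** 2
        )
        scores.append(err_j - base_err)
    return np.array(scores)

# Toy data: the response depends only on covariate 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=200)
imp = loco_importance(X[:100], y[:100], X[100:], y[100:])
```

On this toy example, `imp[0]` dominates the importances of the two noise covariates. Note the cost the abstract highlights: this naive version requires one full refit per covariate plus a held-out test set, whereas the proposed framework delivers LOCO inference from a single ensemble training run.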