A0179
Title: On the existence of simpler-yet-accurate models
Authors: Lesia Semenova - Microsoft Research (United States) [presenting]
Abstract: Finding optimal, sparse, accurate models of various forms (such as linear models with integer coefficients, rule lists, and decision trees) is generally NP-hard. Often, it is unknown whether the search for a simpler model will be worthwhile, and thus the effort to find one is not undertaken. The purpose is to address an important practical question: For which types of datasets interpretable models are expected to perform as well as black-box models? A mechanism of the data generation process is presented, coupled with choices usually made by the analyst during the learning process, that leads to the existence of simpler-yet-accurate models. This mechanism indicates that such models exist in practice more often than one might expect.