A0386
Title: Model selection confidence sets for mixture order selection
Authors: Alessandro Casa - Free University of Bozen-Bolzano (Italy) [presenting]
Davide Ferrari - University of Bolzano (Italy)
Abstract: Determining the number of components in a finite Gaussian mixture model is a critical task in clustering and density estimation. Traditional methods based on information criteria typically select a single model, overlooking the inherent uncertainty in model selection and potentially leading to overconfident or inaccurate inferences. To address this, a set-valued estimator, the model selection confidence set, is introduced. Using a penalized likelihood ratio screening procedure, the method identifies all mixture orders that are statistically indistinguishable from the selected best model. The confidence set comes with formal coverage guarantees: it contains the true number of components with high probability. Its width serves as an indicator of data informativeness: a narrower set suggests stronger evidence for a specific order, while a wider set signals greater uncertainty. The method adapts well to the complexity of the data distribution and performs strongly in simulations across various scenarios. Real data applications further validate its practical advantages and robustness over traditional single-model selection approaches.
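
The screening idea can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not specify the penalty or the critical value, so the sketch assumes a BIC-type penalty and a user-chosen cutoff (`threshold`), and the function names and the log-likelihood values are hypothetical, chosen only to show how a set of orders, rather than a single order, is returned.

```python
import math


def n_free_params(k, d=1):
    """Free parameters of a k-component Gaussian mixture in d dimensions
    with unrestricted covariances: weights + means + covariances."""
    return (k - 1) + k * d + k * d * (d + 1) // 2


def penalized_score(loglik, k, n, d=1):
    """BIC-type penalized criterion (assumed penalty; smaller is better)."""
    return -2.0 * loglik + n_free_params(k, d) * math.log(n)


def confidence_set(logliks, n, d=1, threshold=15.0):
    """Keep every order whose penalized score is within `threshold`
    of the best one. `threshold` is a hypothetical cutoff standing in
    for the critical value of the screening test."""
    scores = {k: penalized_score(ll, k, n, d) for k, ll in logliks.items()}
    best = min(scores, key=scores.get)
    kept = sorted(k for k, s in scores.items() if s - scores[best] <= threshold)
    return best, kept


# Hypothetical maximized log-likelihoods for orders 1..4 (illustrative only).
logliks = {1: -520.0, 2: -480.0, 3: -478.5, 4: -477.9}
best, kept = confidence_set(logliks, n=200)
print("best order:", best, "| retained orders:", kept)
```

With a cutoff of zero the set collapses to the single best-scoring order, which corresponds to ordinary single-model selection; a larger cutoff widens the set, mirroring the abstract's reading of set width as a measure of uncertainty.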