View Submission - CFE-CMStatistics 2025
A0547
Title: Revisiting parameter estimation and model selection in Gaussian mixtures of experts for clustering and regression
Authors: Trung Tin Nguyen - Queensland University of Technology (Australia) [presenting]
Christopher Drovandi - Queensland University of Technology (Australia)
Nhat Pham Minh Ho - University of Texas at Austin (United States)
Abstract: Mixture of experts (MoE) models constitute a widely used class of ensemble learning approaches in statistics and machine learning, known for their flexibility and computational efficiency. Despite their practical success, the theoretical understanding of model selection, especially concerning the optimal number of mixture components or experts, remains limited and poses significant challenges. These challenges primarily stem from the inclusion of covariates in both the Gaussian gating functions and the expert networks, which introduces intrinsic interactions, governed by partial differential equations, among their parameters. The use of dendrograms of mixing measures is revisited, and Bayesian nonparametric techniques are incorporated to avoid predefining the number of experts. The approach enables consistent estimation of the true number of components and achieves pointwise optimal convergence rates in overfitted regimes. Importantly, it eliminates the need to train and compare multiple models with different numbers of components, reducing computational costs in high-dimensional or deep learning contexts. Experiments on synthetic datasets confirm the effectiveness of the proposed method, showing superior performance over conventional criteria such as AIC, BIC, and ICL in both expert recovery and parameter estimation accuracy.
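The proposed dendrogram-based procedure itself is not detailed in the abstract, but the conventional baseline it is compared against can be sketched. The following is a minimal, illustrative NumPy example (not the authors' method, and using a plain 1-D Gaussian mixture rather than a mixture of experts) of the "train one model per candidate K, then compare an information criterion" workflow that the abstract argues is computationally wasteful. All function names, the quantile-based initialization, and the synthetic data are assumptions made for the sketch.

```python
import numpy as np

def fit_gmm_1d(x, k, n_iter=200):
    """EM for a 1-D Gaussian mixture with k components; returns the log-likelihood."""
    n = x.size
    # Deterministic initialization: spread the means over the data quantiles.
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    var = np.full(k, x.var())
    w = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: log responsibilities, shape (n, k), normalized in log space.
        log_pdf = (-0.5 * np.log(2 * np.pi * var)
                   - 0.5 * (x[:, None] - mu) ** 2 / var)
        log_r = np.log(w) + log_pdf
        log_norm = np.logaddexp.reduce(log_r, axis=1, keepdims=True)
        r = np.exp(log_r - log_norm)
        # M-step: update weights, means, and variances from responsibilities.
        nk = r.sum(axis=0)
        w = nk / n
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        var = np.maximum(var, 1e-6)  # guard against degenerate components
    return log_norm.sum()

def bic(loglik, k, n):
    # A 1-D GMM with k components has k means + k variances + (k - 1) weights.
    p = 3 * k - 1
    return -2.0 * loglik + p * np.log(n)

# Synthetic data from a well-separated 3-component mixture.
rng = np.random.default_rng(0)
x = np.concatenate([
    rng.normal(-4.0, 0.5, size=200),
    rng.normal(0.0, 0.5, size=200),
    rng.normal(4.0, 0.5, size=200),
])

# Conventional selection: fit a separate model for every candidate K,
# then keep the one with the lowest BIC.
scores = {k: bic(fit_gmm_1d(x, k), k, x.size) for k in range(1, 7)}
best_k = min(scores, key=scores.get)
print(best_k)
```

The point of the sketch is the outer loop: every candidate K requires a full EM fit, which is exactly the repeated training cost that the abstract's dendrogram-of-mixing-measures approach, fitting once in an overfitted regime, is designed to avoid.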