A0800
Title: Statistical inference under dependent Gaussian mixture models
Authors: Rajarshi Mukherjee - Harvard T.H. Chan School of Public Health (United States) [presenting]
Abstract: Gaussian mixture models are widely used to model data generated from multiple latent sources. Despite their popularity, most theoretical research assumes that the labels are independent and identically distributed, and it is unclear how the fundamental limits of estimation change under dependence. This question is addressed for the spherical two-component Gaussian mixture model with dependent labels. It is first shown that, for labels with an arbitrary dependence structure, a naive estimator based on the misspecified likelihood that ignores the dependence is $\sqrt{n}$-consistent. Additionally, when the labels follow an Ising model, a popular quadratic interaction model that allows network dependence, the information-theoretic limits of estimation are established, and an interesting phase transition emerges as the dependence grows stronger. When the dependence is below a threshold, the optimal estimator and its limiting variance exactly match those of the independent case; this result holds for a wide class of Ising models whose underlying network is sufficiently dense. Under stronger dependence, by contrast, estimation becomes easier, and the naive estimator is no longer optimal. An alternative estimator is therefore proposed, based on a variational approximation of the likelihood, and its optimality is argued under a specific Ising model. In both cases, there is no information-computation gap, and the proposed estimators are computationally tractable.
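
The abstract does not spell out the model equations, so the following is a minimal simulation sketch under assumed choices: the standard symmetric formulation $X_i = \sigma_i \mu + \varepsilon_i$ with labels $\sigma_i \in \{-1, +1\}$ and noise $\varepsilon_i \sim N(0, I_d)$, labels drawn from a Curie-Weiss Ising model (one concrete dense-network instance), and a naive estimator maximizing the likelihood computed as if the labels were i.i.d. Rademacher. The Curie-Weiss choice, the function names, and the parameter values are illustrative assumptions, not the paper's exact setup.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def gibbs_curie_weiss(n, beta, sweeps=100):
    # Gibbs sampler for +/-1 labels under a Curie-Weiss Ising model,
    # an assumed concrete instance of a dense-network Ising model.
    s = rng.choice([-1.0, 1.0], size=n)
    for _ in range(sweeps):
        for i in range(n):
            field = (beta / n) * (s.sum() - s[i])  # interaction with all other spins
            p_plus = 1.0 / (1.0 + np.exp(-2.0 * field))
            s[i] = 1.0 if rng.random() < p_plus else -1.0
    return s

def naive_neg_loglik(mu, X):
    # Negative log-likelihood under the *misspecified* i.i.d. symmetric mixture:
    # log f(x) = log cosh(<x, mu>) - ||mu||^2 / 2 + const.
    t = X @ mu
    logcosh = np.abs(t) + np.log1p(np.exp(-2.0 * np.abs(t))) - np.log(2.0)  # stable log cosh
    return -(logcosh.sum() - X.shape[0] * (mu @ mu) / 2.0)

n, d, beta = 1000, 5, 0.4            # beta = 0.4 is below the Curie-Weiss critical point beta = 1
mu_true = np.ones(d) / np.sqrt(d)
sigma = gibbs_curie_weiss(n, beta)
X = sigma[:, None] * mu_true + rng.standard_normal((n, d))

res = minimize(naive_neg_loglik, x0=X[0], args=(X,), method="BFGS")  # a data point as a rough init
mu_hat = res.x if res.x @ mu_true >= 0 else -res.x  # mu is identifiable only up to sign
print("estimation error:", np.linalg.norm(mu_hat - mu_true))

Under the assumed symmetric model, the misspecified per-observation log-likelihood reduces, up to constants, to $\log\cosh(\langle x, \mu\rangle) - \|\mu\|^2/2$, which is what naive_neg_loglik negates; the final sign-alignment step reflects that $\mu$ is identifiable only up to sign.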