CMStatistics 2021
B1640
Title: Approximated variational inference based on data augmentation methods
Authors: Cristian Castiglione - Bocconi University (Italy) [presenting]
Mauro Bernardi - University of Padova (Italy)
Abstract: Data augmentation is a powerful device that makes it possible to formalize complicated statistical models through an equivalent, more convenient representation relying on a set of auxiliary variables. This strategy is often employed for computational purposes to design iterative algorithms with closed-form updates, and it is fruitful for both optimization and simulation problems. Remarkable examples are the augmented EM, Gibbs sampling and mean-field variational Bayes algorithms. Despite their simplicity, data augmentation methods have been shown to suffer from low convergence rates and high sample autocorrelation, in the EM and Markov chain Monte Carlo cases, respectively. No theoretical results are available in the variational Bayes context, even though empirical experience suggests that different choices of augmentation strategy can strongly affect the goodness of the posterior approximation. This gap is bridged by proving theoretically that the introduction of auxiliary variables leads to a systematic loss of information, measured as an increment of the Kullback-Leibler divergence between the approximate and the true posterior density, thereby reducing the global approximation accuracy. The validity of this result is also supported by several data applications involving logistic regression, quantile regression and support vector machine classification.
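The information-loss phenomenon described above can be illustrated on a toy Gaussian example. This is a hedged sketch of our own (the target, the correlation value, and the closed-form mean-field solution used here are illustrative assumptions, not taken from the submission): when the augmented posterior of a parameter theta and an auxiliary variable omega is a correlated bivariate Gaussian, the mean-field factorization q(theta)q(omega) shrinks the marginal variance of theta, so its Kullback-Leibler divergence from the true marginal is strictly positive, whereas a direct (non-augmented) Gaussian approximation of the marginal would be exact.

```python
import math

# Illustrative assumption: the "augmented posterior" is bivariate Gaussian,
#     p(theta, omega) = N(0, [[1, rho], [rho, 1]]).
rho = 0.9

# Without augmentation, a Gaussian family matches the true marginal
# p(theta) = N(0, 1) exactly, so the best achievable KL divergence is zero.
kl_direct = 0.0

# With augmentation, the mean-field optimum for a Gaussian target is known
# in closed form: q(theta) = N(0, 1/Lambda_11) with Lambda = Sigma^{-1}.
# Here Lambda_11 = 1/(1 - rho^2), so the approximate marginal variance
# shrinks to 1 - rho^2 (the classic mean-field variance underestimation).
s2 = 1.0 - rho**2

# KL( N(0, s2) || N(0, 1) ) for the theta-marginal of the augmented solution.
kl_augmented = 0.5 * (s2 - 1.0 - math.log(s2))

print(f"KL without augmentation: {kl_direct:.4f}")  # 0.0000
print(f"KL with augmentation:    {kl_augmented:.4f}")
```

The stronger the coupling rho between the parameter and the auxiliary variable, the larger the KL gap, which matches the abstract's point that the choice of augmentation strategy affects the accuracy of the posterior approximation.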