Title: Dilemma: Distributed learning with Markov Chain Monte Carlo algorithms
Authors: Ali Zaidi - Microsoft (United States) [presenting]
Abstract: Many scalable Monte Carlo algorithms conduct local updates across batched datasets to construct reduced-variance estimators that are inherently biased. By reweighting the samples processed by each node in a distributed clustering environment, it is possible to reduce the overall bias of the algorithm while attaining computationally efficient, low-variance estimators. However, theoretical guarantees on the convergence of the algorithm (or of its infinitesimal generator) to the true target distribution are no longer valid, because of the asymptotic bias induced by such distributed computations. Stein's method for bounding the divergence between measures is described as a way to compute the discrepancy between the target distribution and the sampled distribution. This methodology is further supplemented with the Malliavin calculus for estimating functionals of Gaussian processes, so that the discrepancy measure can be used to produce automatically tuned Metropolis proposals that optimize exploration/exploitation schemes for complex target distributions. The aim is to summarize these results, to describe how to use the accompanying R package, and to provide examples for the overall distributed learning framework.
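As a rough illustration of measuring the discrepancy between a target distribution and a set of (possibly reweighted) samples, the sketch below computes a kernelized Stein discrepancy for a one-dimensional target. The abstract does not say which Stein discrepancy is used, so the kernelized form, the RBF kernel, the bandwidth `h`, and the function names `stein_kernel`, `ksd_squared`, and `standard_normal_score` are all assumptions made for this example. It is written in Python and does not reflect the interface of the accompanying R package.

```python
import math
import random


def stein_kernel(x, y, score, h=1.0):
    """Stein kernel u_p(x, y) for a 1-D target, built on an RBF base kernel.

    `score` is the derivative of the log target density (assumed known).
    """
    d = x - y
    k = math.exp(-d * d / (2.0 * h * h))          # RBF kernel k(x, y)
    dk_dx = -d / (h * h) * k                       # dk/dx
    dk_dy = d / (h * h) * k                        # dk/dy
    d2k_dxdy = (1.0 / (h * h) - d * d / h ** 4) * k  # d^2 k / dx dy
    return (score(x) * score(y) * k
            + score(x) * dk_dy
            + score(y) * dk_dx
            + d2k_dxdy)


def ksd_squared(samples, score, weights=None, h=1.0):
    """Weighted V-statistic estimate of the squared kernel Stein discrepancy.

    `weights` (hypothetical per-sample weights, e.g. from reweighting the
    output of each node) default to uniform and should sum to one.
    """
    n = len(samples)
    if weights is None:
        weights = [1.0 / n] * n
    total = 0.0
    for xi, wi in zip(samples, weights):
        for xj, wj in zip(samples, weights):
            total += wi * wj * stein_kernel(xi, xj, score, h)
    return total


def standard_normal_score(x):
    """Score function of the N(0, 1) target: d/dx log p(x) = -x."""
    return -x


# Compare draws from the target with a deliberately biased (shifted) copy.
rng = random.Random(0)
exact = [rng.gauss(0.0, 1.0) for _ in range(200)]
shifted = [x + 1.0 for x in exact]

ksd_exact = ksd_squared(exact, standard_normal_score)
ksd_shifted = ksd_squared(shifted, standard_normal_score)
print(ksd_exact, ksd_shifted)
```

Under these assumptions, the shifted sample should give a noticeably larger value than the sample drawn from the target. A tuning scheme could in principle use such a value as a signal of asymptotic bias.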