B1282
Title: Estimating the normalizing constant in Bayesian networks
Authors: Fritz Bayer - ETH Zurich (Switzerland) [presenting]
Giusi Moffa - University of Basel (Switzerland)
Niko Beerenwinkel - ETH Zurich (Switzerland)
Jack Kuipers - ETH Zurich (Switzerland)
Abstract: Bayesian networks are probabilistic graphical models that can efficiently represent dependencies among random variables. Missing data and hidden variables require calculating the probability of a subset of the random variables, the so-called normalizing constant. While knowledge of the normalizing constant is crucial for various problems in statistics and machine learning, its exact computation is usually infeasible for categorical variables because the task is NP-hard. We develop a divide-and-conquer approach that uses the graphical structure of Bayesian networks to split the computation of the normalizing constant into sub-calculations of lower complexity. Exploiting this decomposition, we present an efficient and scalable algorithm for estimating the normalizing constant for categorical variables. The novel method displays superior performance in a benchmarking study, where we compare it against state-of-the-art approximate inference methods. As an immediate application, we demonstrate how the normalizing constant can be used to classify incomplete data against Bayesian network clusters, and we use this approach to identify the cancer subtype of kidney cancer samples from their genotypes. The proposed scheme enables the efficient application of Bayesian networks to incomplete data and hidden variables and is presented as a general framework that can be extended to other exact and approximate inference schemes.
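The sketch below is not the authors' algorithm; it only illustrates the underlying idea on a hypothetical toy example: the normalizing constant P(evidence) of a categorical Bayesian network is a sum over the missing variables, and when the graph structure separates the missing variables into independent blocks, that sum factorizes into smaller, cheaper sums. The network (A -> B, C -> D), the CPT values, and the evidence pattern are all assumptions made up for illustration.

```python
# Minimal sketch (illustrative toy example, not the method from the abstract):
# computing the normalizing constant P(evidence) in a tiny categorical
# Bayesian network, once by brute-force summation and once by exploiting the
# graph structure to split the sum into independent sub-calculations.
import itertools

# Toy network with two structurally independent blocks: A -> B and C -> D.
# All variables are binary; CPTs are indexed by parent configuration.
p_A = {0: 0.6, 1: 0.4}
p_B_given_A = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.3, 1: 0.7}}   # p_B_given_A[a][b]
p_C = {0: 0.2, 1: 0.8}
p_D_given_C = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.25, 1: 0.75}}  # p_D_given_C[c][d]

def joint(a, b, c, d):
    """Full joint probability P(A, B, C, D) as a product of the CPTs."""
    return p_A[a] * p_B_given_A[a][b] * p_C[c] * p_D_given_C[c][d]

# Evidence: B and D are observed, A and C are missing.
b_obs, d_obs = 1, 0

# Naive approach: sum the joint over all configurations of the missing
# variables (exponential in the number of missing variables).
z_naive = sum(joint(a, b_obs, c, d_obs)
              for a, c in itertools.product([0, 1], repeat=2))

# Divide-and-conquer: A and C lie in separate components of the graph, so the
# sum factorizes into two low-dimensional sums over single missing variables.
z_block_AB = sum(p_A[a] * p_B_given_A[a][b_obs] for a in [0, 1])
z_block_CD = sum(p_C[c] * p_D_given_C[c][d_obs] for c in [0, 1])
z_split = z_block_AB * z_block_CD

print(f"naive: {z_naive:.6f}")   # 0.102000
print(f"split: {z_split:.6f}")   # 0.102000
assert abs(z_naive - z_split) < 1e-12
```

In this toy case the split is exact because the two blocks share no edges; the abstract's contribution concerns how to obtain and exploit such decompositions efficiently in general networks, where the sub-calculations are combined into an estimate of the normalizing constant.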