COMPSTAT 2024 Submission A0455
Title: Taming numerical imprecision by adapting the KL divergence to negative probabilities
Authors: Simon Pfahler - University of Regensburg (Germany) [presenting]
Peter Georg - University of Regensburg (Germany)
Rudolf Schill - ETH Zürich (Switzerland)
Maren Klever - RWTH Aachen (Germany)
Lars Grasedyck - RWTH Aachen (Germany)
Rainer Spang - University of Regensburg (Germany)
Tilo Wettig - University of Regensburg (Germany)
Abstract: The Kullback-Leibler (KL) divergence is frequently used in data science to compare probability distributions. For discrete probability vectors on exponentially large state spaces, approximations are typically needed to keep calculations tractable. Such approximations may introduce a few small negative entries into the probability vectors, rendering the KL divergence undefined. To address this problem, a parameterized substitute divergence measure, the shifted KL (sKL) divergence, is introduced. In contrast to existing techniques, the approach is not problem-specific and does not increase the computational overhead. The sKL divergence retains many of the useful properties of the KL divergence while being resilient to Gaussian noise in the probability vectors. For a large class of parameter choices, the sKL divergence is proven to converge to the KL divergence in the limit of small Gaussian noise. For the tensor-train approximation, a concrete example that does not satisfy the Gaussian-noise assumption, the method is shown to still work reliably. As an example application, it is also shown how the approach can be used in bioinformatics to accelerate the optimization of mutual hazard networks, a type of cancer-progression model.
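The abstract does not spell out the definition of the sKL divergence. The sketch below only illustrates the general idea of a "shifted" KL-style measure: each entry pair (p_i, q_i) is shifted by a nonnegative parameter eps_i chosen large enough that both shifted entries become positive, and the usual KL summand is then applied to the shifted values. The function name, the per-entry choice of eps, and the example data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def shifted_kl(p, q, eps=None, margin=1e-12):
    """Hypothetical shifted-KL-style divergence (illustrative sketch only).

    Each entry pair (p_i, q_i) is shifted by a nonnegative eps_i so that both
    shifted entries are positive; the usual KL summand is then evaluated on
    the shifted values. For eps_i = 0 a term reduces to the ordinary KL term.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if eps is None:
        # Simple per-entry choice (an assumption): shift only where an entry
        # of p or q is nonpositive, by just enough to clear a small margin.
        eps = np.maximum(0.0, margin - np.minimum(p, q))
    ps, qs = p + eps, q + eps
    return float(np.sum(ps * np.log(ps / qs)))

# Example: an approximate "probability vector" with one small negative entry,
# as can arise from a low-rank (e.g. tensor-train) approximation.  The
# ordinary KL divergence is undefined for this input.
p_approx = np.array([0.62, 0.40, -0.02])   # approximate vector, sums to 1
q_ref    = np.array([0.60, 0.35,  0.05])   # reference distribution

print(shifted_kl(p_approx, q_ref))
```

Because the shift is applied entrywise and requires no problem-specific preprocessing, a measure of this form keeps the per-entry cost of the ordinary KL computation, consistent with the abstract's claim that the approach adds no computational overhead.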