CFE-CMStatistics 2024
A0511
Title: Understanding uncertainty in Bayesian clustering
Authors: Sara Wade - University of Edinburgh (United Kingdom) [presenting]
Cecilia Balocchi - University of Edinburgh (United Kingdom)
Abstract: The Bayesian approach to clustering is often appreciated for its ability to quantify uncertainty in the partition structure. However, summarizing the posterior distribution over the clustering structure can be challenging. A prior study proposed summarizing the posterior samples with a single optimal clustering estimate, the one that minimizes the posterior expected variation of information (VI). When the posterior distribution is multimodal, it can be beneficial to summarize the posterior samples with multiple clustering estimates, each corresponding to a different region of the space of partitions that receives substantial posterior mass. The aim is to find such clustering estimates by approximating the posterior distribution in the sense of a VI-based Wasserstein distance. An interesting byproduct is that this problem can be viewed as applying the k-medoids algorithm to divide the posterior samples into groups, each represented by one of the clustering estimates. Using both synthetic and real datasets, it is shown that the proposal improves the understanding of uncertainty, particularly when the data clusters are not well separated or when the employed model is misspecified.
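The k-medoids view mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes posterior samples are stored as integer label vectors, uses the standard definition VI(A, B) = H(A) + H(B) - 2 I(A, B) (in bits), and runs a plain alternating k-medoids on the pairwise VI matrix. The function names, the farthest-point initialization, and the toy samples are all hypothetical choices made for illustration.

```python
import numpy as np


def variation_of_information(a, b):
    """VI (in bits) between two partitions given as equal-length label vectors."""
    a = np.asarray(a).ravel()
    b = np.asarray(b).ravel()
    n = a.size
    # Relabel to 0..K-1 so the labels can index a contingency table.
    _, ai = np.unique(a, return_inverse=True)
    _, bi = np.unique(b, return_inverse=True)
    table = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(table, (ai, bi), 1)
    p = table / n                      # joint distribution of the two labelings
    pa, pb = p.sum(axis=1), p.sum(axis=0)
    h_a = -np.sum(pa * np.log2(pa))    # entropy of partition a
    h_b = -np.sum(pb * np.log2(pb))    # entropy of partition b
    nz = p > 0
    mi = np.sum(p[nz] * np.log2(p[nz] / np.outer(pa, pb)[nz]))
    return max(h_a + h_b - 2.0 * mi, 0.0)  # clamp tiny negative rounding error


def pairwise_vi(samples):
    """Symmetric matrix of VI distances between all pairs of sampled partitions."""
    m = len(samples)
    dist = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            dist[i, j] = dist[j, i] = variation_of_information(samples[i], samples[j])
    return dist


def k_medoids(dist, k, n_iter=100):
    """Alternating k-medoids on a precomputed distance matrix.

    Initialization (an assumption of this sketch): start from the overall
    medoid, then repeatedly add the sample farthest from the current medoids.
    """
    m = dist.shape[0]
    medoids = [int(np.argmin(dist.sum(axis=0)))]
    while len(medoids) < k:
        nearest = dist[:, medoids].min(axis=1)
        medoids.append(int(np.argmax(nearest)))
    medoids = np.array(medoids)
    for _ in range(n_iter):
        labels = np.argmin(dist[:, medoids], axis=1)   # assign to nearest medoid
        new = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size:                           # keep old medoid if group is empty
                within = dist[np.ix_(members, members)].sum(axis=0)
                new[j] = members[np.argmin(within)]
        if np.array_equal(new, medoids):
            break
        medoids = new
    labels = np.argmin(dist[:, medoids], axis=1)
    weights = np.bincount(labels, minlength=k) / m     # share of samples per group
    return medoids, labels, weights


# Toy "posterior samples" (made up for illustration): two distinct partitions
# of six items, appearing 6 and 4 times respectively.
samples = np.array([[0, 0, 0, 1, 1, 1]] * 6 + [[0, 0, 1, 1, 2, 2]] * 4)
medoids, labels, weights = k_medoids(pairwise_vi(samples), k=2)
for idx, w in zip(medoids, weights):
    print(samples[idx], w)
```

Each medoid plays the role of one clustering estimate, and its weight is the fraction of posterior samples assigned to it. In practice the number of estimates `k` would have to be chosen, and a more careful search than this alternating scheme may be needed.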