View Submission - COMPSTAT2024
A0430
Title: A Bayesian approach to ensemble clustering
Authors: Federico Maria Quetti - University of Pavia (Italy)
Silvia Figini - University of Pavia (Italy)
Elena Ballante - Department of Political and Social Sciences, University of Pavia (Italy) [presenting]
Abstract: In the context of ensemble clustering, little attention has been given to the integration of conventional bootstrap methodologies within clustering frameworks. This work aims to bridge that gap by introducing an approach that enhances clustering techniques through the application of Bayesian bootstrap techniques. The method leverages information obtained from bootstrap resampling, incorporating a Gaussian mixture as the prior distribution of group densities. The methodology comprises two steps. First, the Efron bootstrap is employed to robustly estimate the parameters of the prior distribution from the available data. Then, a proper Bayesian bootstrap is applied to resample from a mixture of the prior distribution and the empirical distribution of the data. Combining prior knowledge with the observed data enhances the adaptability and robustness of the clustering process. Moreover, the application of the bootstrap naturally leads to a fuzzy clustering interpretation of the results, providing better interpretability of the algorithm as well as ideas for future advancements. The proposed methodology also shows promising results for the determination of the optimal number of clusters. By varying the number of clusters and the variance of the prior distribution, the method offers insights into the underlying cluster structure of the data, even in scenarios characterized by high dimensionality of the data.
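The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the base clustering algorithm, the weighting scheme, or any parameter names, so everything below is an assumption made for illustration. The sketch uses one-dimensional data, a simple weighted k-means as the base clusterer, a shared prior standard deviation `prior_sd` as the tunable prior variance, and Dirichlet weights in which the observations get concentration 1 each and `n_pseudo` draws from the Gaussian mixture prior share a total concentration `alpha`. The function names (`kmeans_1d`, `estimate_prior`, `bayesian_bootstrap_memberships`) are hypothetical.

```python
import random
import statistics


def nearest(x, centers):
    """Index of the center closest to x."""
    return min(range(len(centers)), key=lambda j: abs(x - centers[j]))


def kmeans_1d(xs, k, weights=None, iters=25):
    """Weighted Lloyd iterations in 1-D; centers are returned sorted,
    so cluster labels are comparable across bootstrap replicates."""
    if weights is None:
        weights = [1.0] * len(xs)
    s = sorted(xs)
    # Deterministic start at evenly spaced quantiles (an assumption).
    centers = [s[int((j + 0.5) * len(s) / k)] for j in range(k)]
    for _ in range(iters):
        sums, mass = [0.0] * k, [0.0] * k
        for x, w in zip(xs, weights):
            j = nearest(x, centers)
            sums[j] += w * x
            mass[j] += w
        centers = [sums[j] / mass[j] if mass[j] > 0 else centers[j]
                   for j in range(k)]
    return sorted(centers)


def estimate_prior(data, k, n_boot, rng):
    """Step 1 (assumed form): Efron bootstrap estimates of the mixture
    proportions and component means of the Gaussian mixture prior."""
    n = len(data)
    means = [[] for _ in range(k)]
    props = [[] for _ in range(k)]
    for _ in range(n_boot):
        sample = [rng.choice(data) for _ in range(n)]  # resample with replacement
        centers = kmeans_1d(sample, k)
        labels = [nearest(x, centers) for x in sample]
        for j in range(k):
            means[j].append(centers[j])
            props[j].append(labels.count(j) / n)
    pi = [statistics.fmean(p) for p in props]
    mu = [statistics.fmean(m) for m in means]
    return pi, mu


def bayesian_bootstrap_memberships(data, k, prior, prior_sd, alpha,
                                   n_pseudo, n_boot, rng):
    """Step 2 (assumed form): cluster Dirichlet-weighted mixtures of the
    observations and prior pseudo-samples; the share of replicates in which
    an observation falls in each cluster is read as a fuzzy membership."""
    pi, mu = prior
    counts = [[0] * k for _ in data]
    for _ in range(n_boot):
        comps = rng.choices(range(k), weights=pi, k=n_pseudo)
        pseudo = [rng.gauss(mu[j], prior_sd) for j in comps]
        # Dirichlet weights via normalised gamma draws.
        g = [rng.gammavariate(1.0, 1.0) for _ in data]
        g += [rng.gammavariate(alpha / n_pseudo, 1.0) for _ in pseudo]
        total = sum(g)
        centers = kmeans_1d(data + pseudo, k, [v / total for v in g])
        for i, x in enumerate(data):
            counts[i][nearest(x, centers)] += 1
    return [[c / n_boot for c in row] for row in counts]


# Toy usage on two well-separated synthetic groups (illustrative data only).
rng = random.Random(0)
data = [rng.gauss(0, 1) for _ in range(30)] + [rng.gauss(10, 1) for _ in range(30)]
prior = estimate_prior(data, k=2, n_boot=20, rng=rng)
memberships = bayesian_bootstrap_memberships(
    data, k=2, prior=prior, prior_sd=1.0, alpha=5.0,
    n_pseudo=20, n_boot=30, rng=rng)
```

Each row of `memberships` sums to one and can be read as a fuzzy assignment of the corresponding observation. Rerunning the sketch with different values of `k` and `prior_sd` mirrors, in a very simplified way, the exploration of the number of clusters and the prior variance mentioned in the abstract.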