View Submission - CFE-CMStatistics 2025
A0226
Title: Bayesian level set clustering
Authors: Miheer Dewaskar - University of New Mexico (United States) [presenting]
David Buch - Two Sigma (United States)
David Dunson - Duke University (United States)
Abstract: Classically, Bayesian clustering interprets each component of a mixture model as a cluster. The inferred clustering posterior is highly sensitive to any inaccuracies in the kernel within each component. As this kernel is made more flexible, problems arise in identifying the underlying clusters in the data. To address this pitfall, a fundamentally different approach to Bayesian clustering is proposed, one that decouples the clustering problem from flexible modeling of the data density f. Starting with an arbitrary Bayesian model for f and a loss function for defining clusters based on f, a Bayesian decision-theoretic framework is developed for density-based clustering. Within this framework, a Bayesian level set clustering method is developed to cluster data into connected components of a level set of f. Theoretical support is provided, including clustering consistency, and performance is highlighted in a variety of simulated examples. An application to astronomical data illustrates improvements over the popular DBSCAN algorithm in terms of accuracy, insensitivity to tuning parameters, and uncertainty quantification.
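
To make the idea of clustering by connected components of a level set of f concrete, here is a minimal sketch in Python. It is not the authors' method: it replaces the Bayesian model for f with a single plug-in Gaussian kernel density estimate, omits the loss function and posterior entirely, and approximates connectedness by linking points closer than a distance delta. The function names, the toy data, and the values of bandwidth, level, and delta are all assumptions chosen for illustration.

```python
import math


def kde(points, x, bandwidth):
    """Gaussian kernel density estimate at x (assumed stand-in for a model of f)."""
    dim = len(x)
    norm = len(points) * (2 * math.pi * bandwidth ** 2) ** (dim / 2)
    total = 0.0
    for p in points:
        sq_dist = sum((a - b) ** 2 for a, b in zip(p, x))
        total += math.exp(-sq_dist / (2 * bandwidth ** 2))
    return total / norm


def level_set_clusters(points, bandwidth, level, delta):
    """Label each point with a cluster index, or -1 if its density is below `level`."""
    density = [kde(points, p, bandwidth) for p in points]
    # Points inside the estimated level set {x : f(x) >= level}.
    active = [i for i, f in enumerate(density) if f >= level]

    # Union-find over active points; an edge joins points within `delta`.
    parent = {i: i for i in active}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for pos, i in enumerate(active):
        for j in active[pos + 1:]:
            if math.dist(points[i], points[j]) <= delta:
                parent[find(i)] = find(j)

    # Relabel components 0, 1, 2, ... in order of first appearance.
    labels = [-1] * len(points)
    roots = {}
    for i in active:
        root = find(i)
        labels[i] = roots.setdefault(root, len(roots))
    return labels


def grid_blob(cx, cy, step=0.4, half=2):
    """Deterministic toy blob: a small square grid of points centred at (cx, cy)."""
    return [(cx + step * i, cy + step * j)
            for i in range(-half, half + 1)
            for j in range(-half, half + 1)]


# Hypothetical toy data: two well-separated blobs plus one isolated point.
points = grid_blob(0.0, 0.0) + grid_blob(6.0, 0.0) + [(3.0, 10.0)]
labels = level_set_clusters(points, bandwidth=0.5, level=0.03, delta=0.5)
print("clusters found:", len(set(labels) - {-1}))
print("noise points:", labels.count(-1))
```

On this toy data the two blobs form separate components and the isolated point falls below the level, so it is treated as noise. The result depends on the choice of level and delta, much as DBSCAN depends on its tuning parameters; the framework described in the abstract instead works from a posterior over f, which is what allows uncertainty quantification.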