Title: Bayesian divisive clustering
Authors: Paul Kirk - University of Cambridge (United Kingdom) [presenting]
Christopher Foley - University of Cambridge (United Kingdom)
Abstract: A novel model-based Bayesian divisive clustering algorithm is presented. The algorithm starts with all observations in a single cluster and iteratively divides clusters into sub-clusters. Whether a cluster should be subdivided is decided by Bayesian model selection, based on (approximate) Bayes factors. With an appropriate prior on the space of partitions, we establish links to the Dirichlet process mixture model and derive approximations that vastly improve scalability. We present a case study from genetics, in which traits are clustered together if they share a common causal variant. We demonstrate that the scalability of our approach enables analyses that would be impossible using competing state-of-the-art techniques.
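
The split-or-stop decision described above can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: one-dimensional data, a conjugate Gaussian model with known variance (so marginal likelihoods are exact rather than approximated), candidate splits restricted to thresholds on the sorted data, and a Dirichlet-process-style prior odds term alpha * Gamma(n1) * Gamma(n2) / Gamma(n) for splitting n points into groups of size n1 and n2. The function names log_marginal, best_split and divisive_cluster, and the parameters sigma2, m0, tau2 and alpha, are hypothetical.

```python
import math


def log_marginal(xs, sigma2=1.0, m0=0.0, tau2=100.0):
    """Log marginal likelihood of xs under x_i ~ N(mu, sigma2), mu ~ N(m0, tau2).

    Assumed toy model: the cluster mean mu is integrated out analytically.
    """
    n = len(xs)
    d = [x - m0 for x in xs]
    s1 = sum(d)
    s2 = sum(v * v for v in d)
    # Quadratic form and log-determinant of the covariance sigma2*I + tau2*11^T.
    quad = (s2 - tau2 * s1 * s1 / (sigma2 + n * tau2)) / sigma2
    logdet = n * math.log(sigma2) + math.log1p(n * tau2 / sigma2)
    return -0.5 * (n * math.log(2.0 * math.pi) + logdet + quad)


def best_split(xs, alpha=1.0):
    """Return (log posterior odds, split index) of the best threshold split.

    xs must be sorted and contain at least two points. The odds compare
    "two clusters" against "one cluster" and combine the Bayes factor with
    the assumed Dirichlet-process-style prior odds.
    """
    n = len(xs)
    whole = log_marginal(xs)
    best_odds, best_k = -math.inf, None
    for k in range(1, n):
        log_prior_odds = (math.log(alpha) + math.lgamma(k)
                          + math.lgamma(n - k) - math.lgamma(n))
        log_bf = log_marginal(xs[:k]) + log_marginal(xs[k:]) - whole
        log_odds = log_prior_odds + log_bf
        if log_odds > best_odds:
            best_odds, best_k = log_odds, k
    return best_odds, best_k


def divisive_cluster(xs, alpha=1.0):
    """Recursively split a cluster while the best split is favoured."""
    xs = sorted(xs)
    if len(xs) < 2:
        return [xs]
    log_odds, k = best_split(xs, alpha)
    if log_odds <= 0.0:  # the evidence does not favour subdividing
        return [xs]
    return divisive_cluster(xs[:k], alpha) + divisive_cluster(xs[k:], alpha)


if __name__ == "__main__":
    data = [-5.1, -4.9, -5.0, 5.0, 5.2, 4.8]
    print(divisive_cluster(data))
```

In this toy setting the recursion stops as soon as no split has positive log posterior odds, so the number of clusters is not fixed in advance. A scalable version would replace the exhaustive scan over split points and the exact marginal likelihoods with approximations, which this sketch does not attempt.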