CMStatistics 2023
B0639
Title: Bayesian mixture models inconsistency for the number of clusters
Authors: Louise Alamichel - Universite Grenoble Alpes (France) [presenting]
Julyan Arbel - Inria (France)
Guillaume Kon Kam King
Daria Bystrova - Universite Grenoble Alpes (France)
Abstract: Bayesian non-parametric mixture models are commonly used to model complex data. Although these models are well suited to density estimation, their application to clustering has limitations. Recent results have proved that, for Dirichlet and Pitman-Yor process mixture models, the posterior on the number of clusters is inconsistent when the true number of clusters is finite. Solutions have also recently been proposed to achieve consistency for the number of clusters, notably in the case of the Dirichlet process, either by using a post-processing algorithm or by putting a hyperprior on the parameter. These results are discussed and extended to other non-parametric Bayesian priors, such as Gibbs-type processes, and to their finite-dimensional representations, such as the Dirichlet multinomial and Pitman-Yor multinomial processes. It is proven that mixture models based on these processes are also inconsistent for the number of clusters. It is also shown that the post-processing algorithm can be extended to more general models and provides a consistent method for estimating the number of components. Finally, the role played by a hyperprior on the parameters in achieving consistency is studied for the Pitman-Yor process.
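
As a rough illustration of why the number of clusters is delicate under a Dirichlet process prior, the sketch below simulates the Chinese restaurant process, the random partition induced by a Dirichlet process, and compares the simulated number of clusters with its exact prior mean. It shows only prior behaviour (the cluster count keeps growing with the sample size), not the posterior inconsistency results of the abstract. The function names and the parameter `alpha` (concentration) are hypothetical choices made for this example.

```python
import random


def expected_clusters(n, alpha):
    """Exact prior mean of the number of clusters among n items under a CRP(alpha)."""
    # Item i+1 opens a new cluster with probability alpha / (alpha + i).
    return sum(alpha / (alpha + i) for i in range(n))


def simulate_crp_clusters(n, alpha, rng):
    """Draw one partition of n items from a CRP(alpha) and return its cluster count."""
    sizes = []  # sizes[k] is the current size of cluster k
    for i in range(n):
        if rng.random() < alpha / (alpha + i):
            # Open a new cluster (always happens for the first item).
            sizes.append(1)
        else:
            # Join an existing cluster with probability proportional to its size.
            k = rng.choices(range(len(sizes)), weights=sizes)[0]
            sizes[k] += 1
    return len(sizes)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
    alpha = 1.0             # assumed concentration parameter
    for n in (10, 100, 1000):
        draws = [simulate_crp_clusters(n, alpha, rng) for _ in range(200)]
        mean_draw = sum(draws) / len(draws)
        print(n, round(mean_draw, 2), round(expected_clusters(n, alpha), 2))
```

The prior mean grows roughly like `alpha * log(n)`, so the prior never settles on a fixed finite number of clusters as more data arrive.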