A0697
Title: A Bayesian semiparametric mixture model for clustering zero-inflated microbiome data
Authors: Matthew Koslovsky - Colorado State University (United States) [presenting]
Abstract: Microbiome research has immense potential for unlocking insights into human health and disease. A common goal in human microbiome research is identifying subgroups of individuals with similar microbial composition that may be linked to specific health states or environmental exposures. However, existing clustering methods are often not equipped to accommodate the complex structure of microbiome data and typically make limiting assumptions regarding the number of clusters in the data, which can bias inference. A novel Bayesian semiparametric mixture modeling framework is proposed for the zero-inflated multivariate compositional count data collected in microbiome research; it learns the number of clusters in the data while simultaneously performing cluster allocation. In simulation, the clustering performance of the method is compared with distance- and model-based alternatives, and the importance of accommodating zero-inflation when it is present in the data is demonstrated. The model is then applied to identify clusters in microbiome data collected in a study designed to investigate the relationship between gut microbial composition and enteric diarrheal disease.