View Submission - CFE-CMStatistics 2025
A1422
Title: Clustering microbiome data using a mixture of logistic normal multinomial distributions
Authors: Sanjeena Dang - Carleton University (Canada) [presenting]
Abstract: The human microbiome plays an important role in human health and disease status. Next-generation sequencing technologies allow the composition of the human microbiome to be quantified. Clustering these microbiome data can provide valuable information by identifying underlying patterns across samples. However, clustering these datasets is challenging. Taxa count data in microbiome studies are typically high-dimensional and over-dispersed, and they reveal only relative abundance, so they are often treated as compositional. Analyzing such compositional data presents many challenges because the data are restricted to a simplex. The aim is to present recent advances in clustering microbiome data using a mixture of logistic normal multinomial (LNM) models. In an LNM model, the relative abundance of the microbiome is mapped from the simplex to a latent variable in real Euclidean space using the additive log-ratio transformation. An efficient framework based on variational Gaussian approximations (VGA) is used for parameter estimation. Adopting a VGA for the posterior of the latent variable substantially reduces the computational overhead. Recent developments that use extensions of the LNM distribution to cluster high-dimensional microbiome data are discussed.
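The additive log-ratio mapping mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the talk: the function names `alr` and `alr_inverse`, the choice of the last taxon as the reference part, and the taxa counts are all assumptions made for illustration.

```python
import math

def alr(composition):
    """Additive log-ratio transform: map a composition with K+1 parts
    to a point in R^K, using the last part as the reference (an assumption)."""
    ref = composition[-1]
    # Note: every part must be strictly positive, so zero counts would
    # need separate handling before this step.
    return [math.log(p / ref) for p in composition[:-1]]

def alr_inverse(y):
    """Map a point in R^K back to a composition with K+1 parts."""
    z = list(y) + [0.0]                    # the reference part has log-ratio 0
    m = max(z)                             # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

# Hypothetical taxa counts for one sample (illustrative values only).
counts = [120, 30, 50]
total = sum(counts)
composition = [c / total for c in counts]  # relative abundances on the simplex

latent = alr(composition)                  # point in real Euclidean space
recovered = alr_inverse(latent)            # back on the simplex
```

In an LNM model the latent vector is treated as Gaussian and the observed counts as multinomial given the back-transformed composition; the sketch covers only the transformation, not the mixture model or the variational estimation.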