A1649
Title: Identifying microbiome communities and enterotypes using a novel mixed-membership model
Authors: Roberto Ascari - University of Milano-Bicocca (Italy) [presenting]
Alice Giampino - University of Milano-Bicocca (Italy)
Sonia Migliorati - University of Milano-Bicocca (Italy)
Abstract: Understanding how the human gut microbiome affects host health is challenging due to the wide interindividual variability, sparsity, and high dimensionality of microbiome data. Recently, mixed-membership models have been applied to these data to detect latent communities of bacterial taxa that are expected to co-occur. The most widely used mixed-membership model is latent Dirichlet allocation (LDA). However, LDA is limited by the rigidity of the Dirichlet distribution imposed on the community proportions, which hinders its ability to model dependencies and account for overdispersion. To address this limitation, a generalization of LDA is proposed that incorporates the flexible Dirichlet (FD) distribution, introducing greater flexibility into the covariance matrix. In addition to identifying communities, the new model enables the detection of enterotypes, i.e., clusters of samples with similar microbial composition. A computationally efficient collapsed Gibbs sampler is proposed that exploits the conjugacy of the FD distribution with respect to the multinomial model. A simulation study demonstrates the model's ability to recover the true parameter values and the correct number of communities. Moreover, an application to the COMBO dataset highlights its effectiveness in detecting biologically significant communities and enterotypes, indicating that the new model improves on LDA.
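
As an illustration of why the FD can capture clusters of samples, the sketch below draws community proportions from a flexible Dirichlet using its standard finite-mixture representation, FD(alpha, tau, p) = sum_i p_i Dirichlet(alpha + tau e_i). It is a minimal sketch under stated assumptions, not the authors' model or sampler: the function name `sample_flexible_dirichlet`, the parameter values, and the reading of the latent mixture label as an enterotype-like cluster indicator are all assumptions made for this example.

```python
import numpy as np

def sample_flexible_dirichlet(alpha, tau, p, size, rng):
    """Draw `size` vectors from FD(alpha, tau, p) via its mixture form.

    Hypothetical helper for illustration only.
    Returns the sampled proportions and the latent component labels.
    """
    alpha = np.asarray(alpha, dtype=float)
    p = np.asarray(p, dtype=float)
    k = alpha.shape[0]
    # Latent label: which coordinate receives the extra mass tau.
    labels = rng.choice(k, size=size, p=p)
    # Component i of the mixture is Dirichlet(alpha + tau * e_i).
    conc = np.tile(alpha, (size, 1))
    conc[np.arange(size), labels] += tau
    # Independent gamma draws, normalised per row, are Dirichlet distributed.
    g = rng.gamma(shape=conc)
    theta = g / g.sum(axis=1, keepdims=True)
    return theta, labels

rng = np.random.default_rng(0)
# Assumed values: 3 communities, a large tau so the clusters are well separated.
theta, labels = sample_flexible_dirichlet(
    alpha=[1.0, 1.0, 1.0], tau=20.0, p=[0.5, 0.3, 0.2], size=500, rng=rng
)
```

Each row of `theta` is one sample's vector of community proportions. With a large `tau`, rows sharing a label concentrate around the same dominant community, a multimodal structure that a single Dirichlet cannot produce.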