View Submission - EcoSta2024
A0741
Title: Optimization of the regularized Dirichlet multinomial regression and its application in compositional data analysis
Authors: Zeny Feng - University of Guelph (Canada) [presenting]
Alysha Cooper - University of Guelph (Canada)
Ayesha Ali - University of Guelph (Canada)
Lorna Deeth - University of Guelph (Canada)
Tim Arciszewski - Alberta Environment and Parks (Canada)
Abstract: Compositional data measured as taxonomic counts are prevalent in many biological fields, including ecology and microbiology. In ecology, samples of benthic macroinvertebrates taken from different aquatic sites are classified by taxonomic rank: phylum, class, order, family, genus, and species. At a given rank, the taxa counts, conditional on the total count, can be modelled by the Dirichlet multinomial (DM) distribution, which accommodates multinomial over-dispersion. Fitting the model in the presence of covariates can be challenging because the DM distribution falls outside the exponential family and the number of parameters is as high as p × D, where p is the number of covariates and D is the number of taxa. To address these challenges, a sparse group LASSO penalty is proposed for regularized DM regression, and an MM algorithm is formulated to optimize the penalized DM regression likelihood. The proposed method will be applied to identify associations between water variables and the composition of benthic macroinvertebrates using data collected from the Oil Sands Region in Alberta, Canada.
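
For readers unfamiliar with the objective described in the abstract, the following is a minimal sketch of how a sparse-group-LASSO-penalized DM regression objective could be evaluated. It is not the authors' implementation: the log link alpha_j = exp(x'beta_j), the choice of one penalty group per covariate (one row of the p × D coefficient matrix), the tuning parameters `lam` and `mix`, and all function names are assumptions made for illustration. The sketch only evaluates the objective; it does not implement the MM updates mentioned in the abstract.

```python
import math

def dm_log_pmf(counts, alpha):
    """Log-probability of a count vector under the Dirichlet multinomial."""
    n = sum(counts)
    a0 = sum(alpha)
    # Multinomial coefficient: n! / prod(y_j!)
    out = math.lgamma(n + 1) - sum(math.lgamma(y + 1) for y in counts)
    # Gamma-function ratios left after integrating out the Dirichlet weights
    out += math.lgamma(a0) - math.lgamma(n + a0)
    out += sum(math.lgamma(y + a) - math.lgamma(a) for y, a in zip(counts, alpha))
    return out

def alphas_from_covariates(x, beta):
    """Assumed log link: alpha_j = exp(sum_k x[k] * beta[k][j])."""
    n_taxa = len(beta[0])
    return [math.exp(sum(x[k] * beta[k][j] for k in range(len(x))))
            for j in range(n_taxa)]

def sparse_group_lasso(beta, lam, mix):
    """Penalty with one group per covariate (row of beta); mix lies in [0, 1]."""
    l1 = sum(abs(b) for row in beta for b in row)
    group = sum(math.sqrt(len(row)) * math.sqrt(sum(b * b for b in row))
                for row in beta)
    return lam * (mix * l1 + (1.0 - mix) * group)

def penalized_objective(X, Y, beta, lam, mix):
    """Negative DM log-likelihood plus the sparse group LASSO penalty."""
    loglik = sum(dm_log_pmf(y, alphas_from_covariates(x, beta))
                 for x, y in zip(X, Y))
    return -loglik + sparse_group_lasso(beta, lam, mix)
```

In this sketch, `X` is a list of covariate vectors of length p, `Y` is a list of taxa count vectors of length D, and `beta` is a p-row, D-column list of lists. Grouping by covariate means the group term can zero out an entire water variable across all taxa, while the L1 term can zero out individual covariate-taxon coefficients.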