CMStatistics 2020
B0393
Title: ConQuR: Batch effect correction for microbiome data via conditional quantile regression
Authors: Wodan Ling - Fred Hutchinson Cancer Research Center (United States) [presenting]
Michael C Wu - Fred Hutchinson Cancer Research Center (United States)
Abstract: Mega-analysis, which integrates multiple batches of data, boosts the power to detect associations between microbiome data and clinical variables of interest. However, as with other high-throughput data, microbiome data can suffer from severe batch effects, which simultaneously lead to excessive false positives and false negatives. Most existing strategies for mitigating batch effects in microbiome data rely on approaches originally designed for genomic analysis. Many of them assume Gaussian linear or negative binomial regression models, which fail to adequately address the severe zero-inflation, dispersion and heterogeneity of microbiome data. Other strategies tailored to microbiome data can only be used for association testing and so do not support other common analytic goals such as visualization. Moreover, some of them require particular types of controls or spike-ins, which makes them inapplicable to other study designs. We developed ConQuR, a batch correction method that uses a two-part quantile regression model to account for both the inflated zeros and the complex distributional attributes of the non-zero measures. It preserves the zero-inflated integer nature of microbiome data, so the corrected data are compatible with any subsequent microbiome normalization and analysis. We applied ConQuR to several real data sets and showed that it outperforms existing methods in removing batch effects and boosting the power to detect associations.
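
The two-part idea in the abstract can be illustrated with a heavily simplified, covariate-free sketch. This is not the ConQuR implementation: it replaces the conditional quantile regression with plain empirical quantile matching against a reference batch, and it leaves zeros untouched rather than modelling their probability. The function names and the toy counts are hypothetical. What it does show is how handling zeros separately from the non-zero distribution, and mapping each non-zero count to the same percentile of a reference distribution, keeps the output as zero-inflated integers.

```python
import math
from bisect import bisect_right


def zero_fraction(counts):
    """Part 1 (summary only): share of samples where the taxon is absent."""
    return sum(1 for c in counts if c == 0) / len(counts)


def empirical_quantile(sorted_values, p):
    """Smallest value whose empirical CDF reaches p, for 0 < p <= 1."""
    n = len(sorted_values)
    index = max(0, math.ceil(p * n - 1e-9) - 1)
    return sorted_values[index]


def correct_batch(batch_counts, reference_counts):
    """Part 2: map each non-zero count to the reference batch's matching quantile."""
    batch_nonzero = sorted(c for c in batch_counts if c > 0)
    ref_nonzero = sorted(c for c in reference_counts if c > 0)
    if not batch_nonzero or not ref_nonzero:
        return list(batch_counts)  # nothing to align
    corrected = []
    for c in batch_counts:
        if c == 0:
            corrected.append(0)  # assumption of this sketch: zeros stay zeros
            continue
        # Percentile of c among the non-zero counts of its own batch.
        p = bisect_right(batch_nonzero, c) / len(batch_nonzero)
        corrected.append(int(empirical_quantile(ref_nonzero, p)))
    return corrected


reference = [0, 0, 3, 5, 8, 12]    # hypothetical reference batch
shifted = [0, 0, 30, 50, 80, 120]  # hypothetical batch with a 10x scale shift
corrected = correct_batch(shifted, reference)
print(corrected)
```

A regression-based version would condition both parts on batch and covariates, so that biological signal carried by the covariates is not removed along with the batch shift; the sketch above has no way to make that distinction.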