View Submission - EcoSta2023
A0805
Title: ME-Bayes SL: Enhanced Bayesian polygenic risk prediction leveraging information across multiple ancestry groups
Authors: Jin Jin - University of Pennsylvania (United States) [presenting]
Jianan Zhan - 23andMe (United States)
Jingning Zhang - Johns Hopkins University (United States)
Ruzhang Zhao - Johns Hopkins University (United States)
Jared O'Connell - 23andMe (United States)
Yunxuan Jiang - 23andMe (United States)
23andMe Research Team - 23andMe (United States)
Steven Buyske - Rutgers University (United States)
Christopher Gignoux - University of Colorado Anschutz Medical Campus (United States)
Christopher Haiman - University of Southern California Keck School of Medicine (United States)
Eimear Kenny - Icahn School of Medicine at Mount Sinai (United States)
Charles Kooperberg - Fred Hutchinson Cancer Research Center (United States)
Kari North - The University of North Carolina at Chapel Hill Department of Epidemiology (United States)
Bertram Koelsch - 23andMe (United States)
Genevieve Wojcik - Johns Hopkins University Department of Epidemiology (United States)
Haoyu Zhang - National Cancer Institute (United States)
Nilanjan Chatterjee - Johns Hopkins University (United States)
Abstract: Polygenic risk scores (PRS) now show promising predictive performance for a wide variety of complex traits and diseases, but a substantial performance gap remains across populations. ME-Bayes SL, a method for ancestry-specific polygenic prediction that borrows information from genome-wide association study (GWAS) summary statistics across multiple ancestry groups, is proposed. ME-Bayes SL conducts Bayesian hierarchical modelling under a multivariate spike-and-slab model for the effect-size distribution and incorporates an ensemble learning step to combine information across tuning parameter settings and ancestry groups. It shows promising performance compared with alternatives in simulation studies and in data analyses of 16 traits across four distinct studies, totalling 5.7 million participants with substantial ancestral diversity. For example, in the African ancestry population, the method achieves an average gain in prediction R² across 11 continuous traits of 40.2% and 49.3% compared with PRS-CSx and CT-SLEB, respectively. The best-performing method, however, varies with GWAS sample size, target ancestry, underlying trait architecture, and the choice of reference samples for LD estimation; ultimately, a combination of methods may be needed to generate the most robust PRS across diverse populations.
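
The ensemble learning step mentioned in the abstract can be illustrated with a minimal sketch. This is not the ME-Bayes SL implementation: the abstract does not specify the learner, so ordinary least squares stands in for it here, and all data, names (`fit_linear_ensemble`, `candidate_prs`, and so on), and sizes are hypothetical and simulated. The idea shown is only that several candidate PRS, for example from different tuning parameter settings or ancestry groups, are combined into one score by fitting weights on a tuning sample.

```python
import numpy as np


def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat for outcome y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_linear_ensemble(candidate_prs, y):
    """Fit intercept + one weight per candidate PRS by least squares.

    Assumption: plain OLS is a stand-in for the (unspecified) ensemble learner.
    """
    design = np.column_stack([np.ones(len(y)), candidate_prs])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_linear_ensemble(coef, candidate_prs):
    """Combine candidate PRS into a single score using fitted weights."""
    return coef[0] + candidate_prs @ coef[1:]


# Simulated tuning sample (hypothetical sizes and effect sizes).
rng = np.random.default_rng(0)
n_tune, n_snps, n_candidates = 500, 200, 6
genotypes = rng.binomial(2, 0.3, size=(n_tune, n_snps)).astype(float)
true_beta = rng.normal(0.0, 0.05, n_snps) * (rng.random(n_snps) < 0.2)
y = genotypes @ true_beta + rng.normal(0.0, 0.5, n_tune)

# Each column mimics effect-size estimates from one tuning setting or
# ancestry group: here simply the true effects plus independent noise.
candidate_betas = np.stack(
    [true_beta + rng.normal(0.0, 0.05, n_snps) for _ in range(n_candidates)],
    axis=1,
)
candidate_prs = genotypes @ candidate_betas  # shape (n_tune, n_candidates)

coef = fit_linear_ensemble(candidate_prs, y)
ensemble_prs = predict_linear_ensemble(coef, candidate_prs)

# R^2 of each single candidate (squared correlation) vs. the ensemble,
# both measured on the tuning sample itself.
single_r2 = [
    np.corrcoef(candidate_prs[:, k], y)[0, 1] ** 2 for k in range(n_candidates)
]
ensemble_r2 = r_squared(y, ensemble_prs)
```

On the tuning sample the combined score can never have a lower R² than the best single candidate, because each single-candidate fit is a special case of the combined fit. That in-sample figure is optimistic, so a fair comparison would evaluate the fitted weights on a separate validation sample.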