CFE 2019 - CMStatistics
Title: Simultaneous SNP selection and adjustment for population structure in high dimensional prediction models
Authors: Sahir Bhatnagar - McGill University (Canada) [presenting]
Karim Oualkacha - UQAM (Canada)
Yi Yang - McGill University (Canada)
Tianyuan Lu - McGill University (Canada)
Erwin Schurr - McGill University (Canada)
JC Loredo-Osti - Memorial University (Canada)
Marie Forest - Ecole de technologie superieure ETS (Canada)
Celia Greenwood - McGill University (Canada)
Abstract: Complex traits are known to be influenced by a combination of environmental factors and rare and common genetic variants. However, detection of such multivariate associations can be compromised by low statistical power and confounded by population structure. Linear mixed-effects models (LMMs) can account for correlations due to relatedness, but have not been applicable in high-dimensional (HD) settings where the number of fixed-effect predictors greatly exceeds the number of samples. False positives can result from two-stage approaches, where the residuals estimated from a null model adjusted for the subjects' relationship structure are subsequently used as the response in a standard penalized regression model. To overcome these challenges, we develop a general penalized LMM framework called ggmix for simultaneous SNP selection and adjustment for population structure in high-dimensional prediction models. We develop a blockwise coordinate descent algorithm which is highly scalable and computationally efficient, with theoretical guarantees of convergence. Through simulations and two real data examples, we show that ggmix achieves better sensitivity and specificity than the two-stage approach or principal component adjustment while maintaining good predictive ability. ggmix can be used to construct polygenic risk scores and select instrumental variables in Mendelian randomization studies.
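To make the idea of a penalized LMM more concrete, the following is a minimal Python sketch under stated assumptions, not the ggmix implementation. It assumes a single-variance-component model in which the response has covariance proportional to eta * Phi + (1 - eta) * I, where Phi is a kinship matrix and eta is a variance ratio treated as known. Rotating the data by the eigenvectors of Phi then turns the problem into a weighted lasso, which is solved here by plain coordinate descent. The abstract describes a blockwise algorithm; this sketch covers only the update of the fixed effects with the variance parameters held fixed. All function names, the simulated data, and the choice of penalty level are hypothetical.

```python
import numpy as np

def rotate_by_kinship(y, X, kinship):
    """Rotate y and X by the eigenvectors of the kinship matrix.

    If kinship = U diag(d) U^T, the rotated data U^T y and U^T X have
    uncorrelated errors under the assumed covariance model.
    """
    eigvals, eigvecs = np.linalg.eigh(kinship)
    return eigvecs.T @ y, eigvecs.T @ X, eigvals

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def weighted_lasso_cd(y, X, weights, lam, n_iter=200):
    """Coordinate descent for
    0.5 * sum_i w_i * (y_i - x_i^T beta)^2 + lam * ||beta||_1.
    """
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.astype(float).copy()
    col_norms = (weights[:, None] * X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            if col_norms[j] == 0.0:
                continue
            # Weighted correlation of column j with the partial residual
            rho = np.dot(weights * X[:, j], resid) + col_norms[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_norms[j]
            resid -= X[:, j] * (new_bj - beta[j])
            beta[j] = new_bj
    return beta

# Simulated example (all values hypothetical)
rng = np.random.default_rng(0)
n, p, eta = 60, 200, 0.5
X = rng.standard_normal((n, p))
A = rng.standard_normal((n, n))
kinship = A @ A.T / n                     # positive semi-definite stand-in for Phi
beta_true = np.zeros(p)
beta_true[:5] = 2.0
random_effect = np.sqrt(eta) * (A / np.sqrt(n)) @ rng.standard_normal(n)
noise = np.sqrt(1.0 - eta) * rng.standard_normal(n)
y = X @ beta_true + random_effect + noise

y_rot, X_rot, d = rotate_by_kinship(y, X, kinship)
weights = 1.0 / (eta * d + (1.0 - eta))   # inverse of the rotated error variances
lam = 0.1 * np.max(np.abs(X_rot.T @ (weights * y_rot)))  # arbitrary penalty level
beta_hat = weighted_lasso_cd(y_rot, X_rot, weights, lam)
print("selected predictors:", np.flatnonzero(beta_hat))
```

In practice the variance ratio and the penalty level would have to be estimated or tuned rather than fixed as they are here. The point of the sketch is that selection and adjustment for relatedness happen in one fit, in contrast to the two-stage approach of regressing residuals from a null model.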