A0854
Title: Efficient and accurate genome-wide survival association analysis controlling for sample relatedness in biobanks
Authors: Rounak Dey - Harvard University (United States) [presenting]
Abstract: With decades of electronic health records linked to genetic data, large biobanks provide unprecedented opportunities for systematically understanding the genetics of complex diseases. Genome-wide survival association analysis can identify genetic variants associated with age of onset, disease progression, and lifespan. Beyond the obvious computational challenge that such analyses entail, statistical methods must also adjust for unknown genetic ancestry structure and familial relatedness among biobank participants. Further, because of the cohort-based recruitment strategy typically followed in biobanks, most phenotypes are heavily censored, which can lead to extreme type I error inflation in standard asymptotic tests of the null hypothesis of no genetic effect. We developed an efficient and accurate frailty model approach for genome-wide survival association analysis of censored time-to-event (TTE) phenotypes that accounts for both population structure and relatedness. Our method uses state-of-the-art optimization strategies to reduce the computational cost, and the saddlepoint approximation to allow the analysis of heavily censored phenotypes (>90%) and low-frequency genetic variants (down to a minor allele count of 20). We demonstrated the performance of our method through extensive simulation studies and through the analysis of five TTE phenotypes, including lifespan, with heavy censoring rates (90.9% to 99.8%) in ~400,000 UK Biobank participants and ~180,000 individuals in FinnGen.
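To illustrate why a saddlepoint approximation helps when events are rare, the following toy sketch compares tail probabilities for a simple count of rare events. It is not the frailty model described in the abstract: it assumes independent Bernoulli outcomes with a 1% event rate as a loose stand-in for a heavily censored phenotype, and all function names are hypothetical. It contrasts the exact binomial tail, the normal approximation, and a Lugannani-Rice saddlepoint approximation with a lattice continuity correction.

```python
import math


def exact_upper_tail(n, p, k):
    """Exact P(X >= k) for X ~ Binomial(n, p), summed in log space."""
    log_p, log_q = math.log(p), math.log1p(-p)
    terms = (
        math.exp(
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q
        )
        for i in range(k, n + 1)
    )
    return math.fsum(terms)


def normal_upper_tail(n, p, k):
    """Normal (asymptotic) approximation to P(X >= k), no continuity correction."""
    z = (k - n * p) / math.sqrt(n * p * (1.0 - p))
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def saddlepoint_upper_tail(n, p, k):
    """Lugannani-Rice approximation to P(X >= k) with a lattice correction.

    Uses the binomial cumulant generating function K(s) = n*log(1 - p + p*e^s).
    The saddlepoint equation K'(s) = k has a closed-form solution here.
    """
    if not 0 < k < n or k == n * p:
        raise ValueError("need 0 < k < n and k different from the mean n*p")
    s = math.log(k * (1.0 - p) / ((n - k) * p))       # solves K'(s) = k
    cgf = n * math.log1p(p * math.expm1(s))           # K(s)
    w = math.copysign(math.sqrt(2.0 * (s * k - cgf)), s)
    # At the saddlepoint K''(s) = k*(n - k)/n; (1 - e^{-s}) is the lattice correction.
    u = (1.0 - math.exp(-s)) * math.sqrt(k * (n - k) / n)
    phi = math.exp(-0.5 * w * w) / math.sqrt(2.0 * math.pi)
    return 0.5 * math.erfc(w / math.sqrt(2.0)) - phi * (1.0 / w - 1.0 / u)


if __name__ == "__main__":
    n, p, k = 1000, 0.01, 25   # assumed toy values: 1% event rate, far tail
    print("exact      :", exact_upper_tail(n, p, k))
    print("normal     :", normal_upper_tail(n, p, k))
    print("saddlepoint:", saddlepoint_upper_tail(n, p, k))
```

In this toy setting the normal approximation understates the tail probability, which is the direction that produces overly small p-values, while the saddlepoint value stays close to the exact one. The same qualitative behaviour is what motivates saddlepoint-based tests for skewed score statistics, although the actual test statistic in a survival frailty model differs from this binomial example.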