CMStatistics 2016
B1467
Title: Logistic regression augmented community detection with application in identifying autism-related gene pathways
Authors: Qing Pan - George Washington University (United States) [presenting]
Yunpeng Zhao - George Mason University (United States)
Chengan Du - George Mason University (United States)
Abstract: When searching for gene pathways leading to specific disease outcomes, we propose to take advantage of additional information on gene characteristics to differentiate genes of interest from irrelevant background genes when connections involving both types of genes are observed and their relationships to the disease are unknown. A novel generalized stochastic blockmodel is proposed that singles out irrelevant background genes with the help of auxiliary information and clusters relevant genes into cohesive groups using the adjacency matrix. An expectation-maximization algorithm is modified to maximize a joint pseudo-likelihood that assumes latent indicators for relevance to the disease, latent group memberships, and Poisson- or multinomial-distributed link counts within and between groups. Asymptotic consistency of the label assignments is proven. Superior performance and robustness in finite samples are observed in simulation studies. The proposed method identifies previously missed gene sets underlying autism and related neurological diseases using diverse data sources, including de novo mutations, gene expression, and protein-protein interactions.
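Illustration: the kind of data-generating process the abstract describes can be sketched as a small simulation. This is a sketch under stated assumptions, not the authors' model or code. It covers only the generative side (a logistic model for relevance, latent group labels, Poisson link counts) and not the modified EM fitting step. The function name `simulate_network`, the single covariate, the coefficients `b0` and `b1`, and the rate values are all hypothetical choices made for the example.

```python
import numpy as np


def simulate_network(n=200, k=2, seed=0):
    """Simulate a toy network with background nodes and k relevant groups.

    All parameter values here are illustrative assumptions, not values
    taken from the abstract.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical auxiliary covariate: one gene-level feature per node.
    x = rng.normal(size=n)

    # Logistic model for the latent relevance indicator:
    # P(relevant | x) = 1 / (1 + exp(-(b0 + b1 * x))).
    b0, b1 = 0.0, 2.0
    p_relevant = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
    relevant = rng.random(n) < p_relevant

    # Latent labels: 0 marks background, 1..k mark relevant groups.
    labels = np.where(relevant, rng.integers(1, k + 1, size=n), 0)

    # Poisson rates for link counts: high within a relevant group,
    # low for every other pair (including pairs involving background).
    rates = np.full((k + 1, k + 1), 0.05)
    groups = np.arange(1, k + 1)
    rates[groups, groups] = 0.5

    # Draw counts for the upper triangle and mirror them, so the
    # adjacency matrix is symmetric with a zero diagonal.
    lam = rates[labels[:, None], labels[None, :]]
    upper = np.triu(rng.poisson(lam), 1)
    adj = upper + upper.T
    return x, labels, adj


if __name__ == "__main__":
    x, labels, adj = simulate_network()
    print("nodes per label:", np.bincount(labels, minlength=3))
    print("total links:", adj.sum() // 2)
```

A fitting procedure would treat `labels` as unobserved and estimate it from `x` and `adj`; the abstract reports doing this with a modified EM algorithm on a joint pseudo-likelihood, which this sketch does not attempt to reproduce.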