Title: Empowering association tests using unpaired data
Authors: Kayhan Batmanghelich - University of Pittsburgh (United States) [presenting]
Abstract: There is growing interest in the biomedical research community in using retrospective data available in healthcare systems to shed light on associations between different biomarkers. Understanding the associations among various types of biomedical data, such as genetic data, blood biomarkers, and imaging, provides a holistic understanding of human diseases. Testing for an association between two types of data in Electronic Health Records (EHR) requires a substantial number of individuals with both data modalities to achieve reasonable power. Current methods can only use data from individuals who have both data modalities. Hence, researchers cannot take advantage of the much larger EHR samples that have at least one of the data types, which limits the power of the association test. We present a new method called the Semi-paired Association Test (SAT) that makes use of both paired and unpaired data. In contrast to classical approaches, incorporating unpaired data allows the SAT to better control false discoveries and to improve the power of the association test. We study the properties of the new test theoretically and empirically, through a series of simulations and by applying our method to real studies in the context of chronic disease. We identify associations between a high-dimensional characterization of chest CT images and several blood biomarkers, as well as the expression of dozens of genes involved in the immune system.