A0853
Title: Semi-supervised estimation of event rate with doubly censored survival data from electronic health records
Authors: Xuan Wang - University of Utah (United States) [presenting]
Abstract: Electronic health records (EHR) are valuable for translational research and risk prediction, especially when using clinical endpoints such as time to onset of a condition. However, events often occur outside the hospital system, which leads to double censoring: precise event times are difficult to determine without manual chart review. Proxies, such as the first diagnostic code, are commonly used but can introduce bias. SEEDS (Semi-supervised Estimation of Event rate with Doubly-censored Survival data) is introduced, a semi-supervised method that integrates a small, manually labeled dataset with a larger set of surrogate data. The SEEDS estimator is consistent and asymptotically normal under certain conditions, and it outperforms both the supervised method and existing methods, as demonstrated through simulations and an application to estimating type 2 diabetes survival rates using EHR data from Mass General Brigham.