View Submission - EcoSta2024
A0998
Title: Semiparametric maximum likelihood estimation with two-phase stratified case-control sampling
Authors: Yaqi Cao - Minzu University of China (China) [presenting]
Ying Yang - Tsinghua University (China)
Jinbo Chen - University of Pennsylvania (United States)
Abstract: In the two-phase stratified case-control sampling design, some covariates are available only for a subset of cases and controls, who are selected based on the outcome and the fully collected covariates. The analysis often focuses on fitting a logistic regression model to describe the relationship between the outcome and all covariates. Interest also lies in characterizing the distribution of the incomplete covariates conditional on the fully observed ones in the underlying population, which is required for quantifying the predictive accuracy of the fitted model. It is desirable to include all subjects in the analysis to achieve consistent and efficient parameter estimation. A novel semiparametric maximum likelihood approach is proposed under the rare-disease assumption, with estimates obtained through a new reparametrized profile likelihood technique. Large-sample theory is developed for the proposed estimator, and simulations show that it is more efficient than the existing approach. The method is applied to the Breast Cancer Detection and Demonstration Project data, where one risk predictor, breast density, was measured only for a subset of the women in the study.
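To make the sampling design concrete, the following is a minimal Python sketch that simulates a toy two-phase stratified case-control sample. It illustrates only the data structure described in the abstract, not the proposed estimator. The function name `simulate_two_phase`, the coefficients, the covariate distributions, and the sample sizes are all hypothetical choices made for illustration.

```python
import math
import random


def simulate_two_phase(n_pop=20000, n_phase2_per_stratum=50, seed=0):
    """Simulate a toy two-phase stratified case-control sample.

    Every numeric value here is an illustrative assumption, not taken
    from the abstract.
    """
    rng = random.Random(seed)
    # Assumed logistic model: logit P(Y=1 | Z, X) = b0 + bz*Z + bx*X.
    b0, bz, bx = -4.0, 0.5, 1.0  # strongly negative intercept -> rare outcome
    population = []
    for _ in range(n_pop):
        z = 1 if rng.random() < 0.4 else 0   # fully observed covariate
        x = rng.gauss(0.5 * z, 1.0)          # covariate measured only in phase two
        p = 1.0 / (1.0 + math.exp(-(b0 + bz * z + bx * x)))
        y = 1 if rng.random() < p else 0
        population.append({"y": y, "z": z, "x": x})

    # Phase one: all cases plus an equal number of randomly chosen controls.
    cases = [r for r in population if r["y"] == 1]
    controls = [r for r in population if r["y"] == 0]
    phase1 = cases + rng.sample(controls, min(len(cases), len(controls)))

    # Phase two: within each (y, z) stratum, x is measured for a
    # fixed-size random subsample; it is missing (None) for everyone else.
    strata = {}
    for r in phase1:
        strata.setdefault((r["y"], r["z"]), []).append(r)

    sample = []
    for members in strata.values():
        k = min(n_phase2_per_stratum, len(members))
        chosen = {id(r) for r in rng.sample(members, k)}
        for r in members:
            in_phase2 = id(r) in chosen
            sample.append({
                "y": r["y"],
                "z": r["z"],
                "x": r["x"] if in_phase2 else None,
                "phase2": in_phase2,
            })
    return sample
```

Calling `simulate_two_phase()` returns one record per phase-one subject. Records with `phase2` set to `False` carry the outcome and `z` but no `x`, which is the incomplete-data pattern that an analysis including all subjects has to handle.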