CMStatistics 2023
B1543
Title: A generalized label shift model for robust estimation: Predicting cohorts hospitalizations
Authors: Yanyuan Ma - Pennsylvania State University (United States)
Xavier de Luna - Umea University (Sweden)
Mohammad Ghasempour - Umea University (Sweden) [presenting]
Abstract: A prediction problem is tackled in which X is observed for all individuals in a birth cohort and the goal is to predict $E(Y)$, the expectation of an unobserved variable Y for this cohort. In the motivating case study, $E(Y)$ is the cohort's average number of hospitalization days, and X is a large set of covariates obtained from health and administrative registers covering the entire Swedish population. To predict $E(Y)$, information is used from earlier cohorts, for which both X and Y are observed for all individuals, with the aim of obtaining robust predictions under weak assumptions. A generalized label shift model is presented, which describes the distribution $f(X|Y;k)$ for cohort k as an exponential tilt of the distribution for a baseline cohort $k_0$, $f(X|Y;k_0)$. Identification of the model is shown under weak conditions, and semiparametric theory is used to propose an efficient influence function-based estimator of $E(Y)$. Results from a Monte Carlo study and from an application to registered hospitalization data are also presented.
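For orientation, the exponential tilt described in the abstract can be sketched in a generic form. The abstract does not state the authors' parametrization, so the tilt function $h$ and the cohort-specific parameter $\alpha_k$ below are hypothetical placeholders used only for illustration:

$$
f(x \mid y; k) \;=\; \frac{f(x \mid y; k_0)\,\exp\{\alpha_k^{\top} h(x, y)\}}{\int f(u \mid y; k_0)\,\exp\{\alpha_k^{\top} h(u, y)\}\,du}
$$

Under this illustrative form, setting $\alpha_k = 0$ gives $f(x \mid y; k) = f(x \mid y; k_0)$, which is the classical label shift assumption that the conditional distribution of X given Y is the same across cohorts. A nonzero $\alpha_k$ lets that conditional distribution change with the cohort, which is presumably the sense in which the model is generalized.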