View Submission - EcoSta2021
A0738
Title: Causal inference in high dimensions without sparsity
Authors: Steve Yadlowsky - Google Research (United States) [presenting]
Abstract: The focus is on the problem of estimating the average treatment effect in the presence of fully observed, high dimensional confounding variables, where the number of confounders $d$ is of the same order as the sample size $n$. To make the problem tractable, we posit a generalized linear model for the effect of the confounders on the treatment assignment and outcomes, but do not assume any sparsity. Instead, we only require the magnitude of confounding to remain non-degenerate. Despite making parametric assumptions, this setting is a useful surrogate for some of the machine learning methods used to adjust for confounding in two-stage procedures. In particular, estimating the first stage adds variance that does not vanish, forcing us to confront terms in the asymptotic expansion that are normally brushed aside as finite-sample defects. We compare the parametric g-formula, IPW, and two common doubly robust estimators: augmented IPW (AIPW) and targeted maximum likelihood estimation (TMLE). When the outcome model estimates are unbiased, the g-formula outperforms the other estimators in both bias and variance. Among the doubly robust estimators, the TMLE estimator has the lowest variance. Existing theoretical results do not explain this advantage, because the TMLE and AIPW estimators have the same asymptotic influence function. However, our model highlights performance differences between these estimators beyond first-order asymptotics.
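For reference, the g-formula, IPW, and AIPW estimators compared in the abstract are commonly written in the following standard forms; the notation below is generic and not taken from the submission, with treatment $A_i \in \{0,1\}$, outcome $Y_i$, confounders $X_i$, estimated propensity score $\hat e(x) \approx P(A=1 \mid X=x)$, and estimated outcome regressions $\hat\mu_a(x) \approx E[Y \mid A=a, X=x]$:
\[
\hat\tau_{\text{g}} = \frac{1}{n}\sum_{i=1}^n \bigl\{\hat\mu_1(X_i) - \hat\mu_0(X_i)\bigr\}, \qquad
\hat\tau_{\text{IPW}} = \frac{1}{n}\sum_{i=1}^n \left\{\frac{A_i Y_i}{\hat e(X_i)} - \frac{(1-A_i)\,Y_i}{1-\hat e(X_i)}\right\},
\]
\[
\hat\tau_{\text{AIPW}} = \frac{1}{n}\sum_{i=1}^n \left\{\hat\mu_1(X_i) - \hat\mu_0(X_i) + \frac{A_i\bigl(Y_i - \hat\mu_1(X_i)\bigr)}{\hat e(X_i)} - \frac{(1-A_i)\bigl(Y_i - \hat\mu_0(X_i)\bigr)}{1-\hat e(X_i)}\right\}.
\]
TMLE shares the AIPW influence function asymptotically but is a plug-in: it first fluctuates the initial estimates $\hat\mu_a$ using the estimated propensity score and then evaluates the g-formula at the updated fit, which is consistent with the abstract's point that first-order asymptotics alone cannot distinguish the two.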