View Submission - EcoSta2024
A0395
Title: A lasso approach to covariate selection and average treatment effect estimation for RCTs
Authors: Peter Schochet - Mathematica Inc. (United States) [presenting]
Abstract: Covariates are often used to improve power for estimating average treatment effects (ATEs) in randomized controlled trials (RCTs). Pre-specifying covariates is often recommended because it maintains the Type 1 error rate in repeated sampling, but it is not required by major RCT registries and clearinghouses across fields. Thus, many studies identify predictive covariates once primary outcomes have been collected. These post hoc methods, however, can suffer from a lack of transparency and replicability. A recent study addresses these issues by developing Lasso machine learning methods for the post hoc selection of covariates for RCTs, and that approach is discussed. The approach involves pre-specifying a fully replicable process for selecting covariates. The focus is on two-stage estimators: the first stage involves Lasso estimation, and the second stage adjusts regression-based ATE estimators for covariates using the first-stage Lasso results. The design-based approach is nonparametric, applies to continuous, binary, and discrete outcomes, pertains to clustered and non-clustered RCTs, and can be easily implemented using existing software. The $\ell_1$ consistency of the estimated Lasso coefficients, a finite population central limit theorem for the ATE estimators, and design-based variance estimation are discussed. Simulations suggest good statistical performance in real-world settings.
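To make the two-stage idea in the abstract concrete, here is a minimal, dependency-free Python sketch of a generic "Lasso, then regression adjustment" estimator for a non-clustered RCT. It is not the author's estimator. The within-arm centering of the outcome in stage 1, the fixed penalty `lam`, the simulated data, and all function names (`lasso_cd`, `two_stage_ate`, `simulate`, and so on) are assumptions made for illustration. Clustering, penalty selection, and design-based variance estimation are omitted.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator used in the Lasso coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def center(values):
    m = sum(values) / len(values)
    return [v - m for v in values]


def center_within_arm(y, t):
    """Subtract each arm's mean outcome (an assumed way to net out treatment)."""
    out = list(y)
    for arm in (0, 1):
        idx = [i for i, a in enumerate(t) if a == arm]
        m = sum(y[i] for i in idx) / len(idx)
        for i in idx:
            out[i] = y[i] - m
    return out


def lasso_cd(cols, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n = len(y)
    beta = [0.0] * len(cols)
    resid = list(y)
    for _ in range(n_iter):
        for j, xj in enumerate(cols):
            scale = sum(v * v for v in xj) / n
            if scale == 0.0:
                continue
            rho = sum(a * r for a, r in zip(xj, resid)) / n + scale * beta[j]
            new = soft_threshold(rho, lam) / scale
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * a for a, r in zip(xj, resid)]
                beta[j] = new
    return beta


def ols(cols, y):
    """Least squares via the normal equations and Gauss-Jordan elimination."""
    k = len(cols)
    A = [[sum(a * b for a, b in zip(cols[r], cols[c])) for c in range(k)]
         + [sum(a * b for a, b in zip(cols[r], y))] for r in range(k)]
    for i in range(k):
        piv = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        for r in range(k):
            if r != i:
                f = A[r][i] / A[i][i]
                A[r] = [a - f * b for a, b in zip(A[r], A[i])]
    return [A[i][k] / A[i][k - k + i] for i in range(k)]


def two_stage_ate(X_cols, t, y, lam):
    """Stage 1: Lasso picks covariates. Stage 2: OLS of y on treatment + picks."""
    Xc = [center(c) for c in X_cols]
    beta = lasso_cd(Xc, center_within_arm(y, t), lam)
    selected = [j for j, b in enumerate(beta) if b != 0.0]
    design = [[1.0] * len(y), [float(a) for a in t]] + [Xc[j] for j in selected]
    coef = ols(design, y)
    return selected, coef[1]  # coefficient on the treatment indicator


def simulate(n=400, p=6, seed=0):
    """Hypothetical data: true ATE is 2.0 and only covariate 0 is predictive."""
    rng = random.Random(seed)
    t = [1] * (n // 2) + [0] * (n - n // 2)
    rng.shuffle(t)  # complete randomization
    X_cols = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
    y = [1.0 + 2.0 * t[i] + 3.0 * X_cols[0][i] + rng.gauss(0, 1) for i in range(n)]
    return X_cols, t, y


if __name__ == "__main__":
    X_cols, t, y = simulate()
    selected, ate = two_stage_ate(X_cols, t, y, lam=0.5)
    print("selected covariates:", selected)
    print("adjusted ATE estimate:", round(ate, 3))
```

In practice the penalty would be chosen by a pre-specified rule (for example, cross-validation with a fixed seed) rather than hard-coded, which is what keeps the selection process replicable in the sense the abstract describes.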