CMStatistics 2023
B1488
Title: Adaptive debiased machine learning using data-driven model selection techniques
Authors: Lars van der Laan - University of Washington, Seattle (United States) [presenting]
Marco Carone - University of Washington (United States)
Alex Luedtke - University of Washington (United States)
Mark van der Laan - University of California at Berkeley (United States)
Abstract: Debiased machine learning for nonparametric inference on smooth summaries of the data distribution can suffer from instability and excessive variability. For this reason, practitioners may turn to simpler models based on semiparametric assumptions. However, this can lead to bias due to model misspecification. To address this problem, adaptive debiased machine learning (ADML) is proposed: a unifying framework that combines data-driven model selection and debiased machine learning techniques to construct asymptotically linear and superefficient estimators for pathwise differentiable parameters. By learning model structure from data, ADML avoids the bias due to model misspecification and remains free from the restrictions of parametric and semiparametric models. Although ADML estimators may exhibit irregular behaviour for the target parameter in a nonparametric model, it is demonstrated that they provide regular and locally uniformly valid inference for a projection-based oracle parameter. Importantly, this oracle parameter agrees with the original target parameter for distributions within an unknown but correctly specified oracle statistical submodel learned from the data. This finding implies that there is no penalty, in a local asymptotic sense, for conducting data-driven model selection compared to having prior knowledge of the oracle submodel and parameter. The theory is applied to inference on the average treatment effect in adaptive partially linear regression models.
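
As a point of reference for the closing application, the sketch below shows a standard cross-fitted "partialling-out" debiased estimator of the treatment coefficient in a fixed partially linear model E[Y | A, W] = beta * A + h(W), under which beta equals the average treatment effect for binary A. It is not the ADML procedure from the abstract: the data-driven model selection step is omitted entirely, and the function name `crossfit_plr_ate`, the cubic polynomial nuisance learner, and the simulated data are all assumptions made purely for illustration.

```python
import numpy as np


def crossfit_plr_ate(y, a, w, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of beta in E[Y|A,W] = beta*A + h(W).

    Illustrative only: the nuisance functions E[Y|W] and E[A|W] are fitted by
    least squares on a cubic polynomial basis in a scalar covariate W.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    # Randomly assign each observation to one of n_folds folds.
    folds = rng.permutation(n) % n_folds

    def basis(v):
        # Hypothetical nuisance model: cubic polynomial in W.
        return np.column_stack([np.ones(len(v)), v, v**2, v**3])

    res_y = np.empty(n)
    res_a = np.empty(n)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        b_train, b_test = basis(w[train]), basis(w[test])
        coef_y, *_ = np.linalg.lstsq(b_train, y[train], rcond=None)
        coef_a, *_ = np.linalg.lstsq(b_train, a[train], rcond=None)
        # Out-of-fold residuals Y - E[Y|W] and A - E[A|W].
        res_y[test] = y[test] - b_test @ coef_y
        res_a[test] = a[test] - b_test @ coef_a

    # Residual-on-residual regression gives the debiased estimate of beta.
    beta = np.sum(res_a * res_y) / np.sum(res_a**2)
    # Standard error from the estimated influence function.
    infl = res_a * (res_y - beta * res_a) / np.mean(res_a**2)
    se = np.std(infl, ddof=1) / np.sqrt(n)
    return beta, se


# Simulated data (assumed for illustration): true treatment effect is 2.
rng = np.random.default_rng(1)
n = 4000
w = rng.uniform(-1, 1, n)
a = rng.binomial(1, 1 / (1 + np.exp(-w))).astype(float)
y = 2.0 * a + np.sin(2 * w) + rng.normal(size=n)

beta_hat, se_hat = crossfit_plr_ate(y, a, w)
print(f"estimate: {beta_hat:.3f}, "
      f"95% CI: [{beta_hat - 1.96 * se_hat:.3f}, {beta_hat + 1.96 * se_hat:.3f}]")
```

The estimator above is only valid if the partially linear model is correctly specified. The abstract's contribution concerns the case where such structure is instead learned from the data, with inference targeting the projection-based oracle parameter.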