A1610
Title: Distilling causal effects: Stable subgroup estimation via distillation trees in causal inference
Authors: Ana Kenney - University of California, Irvine (United States) [presenting]
Tiffany Tang - University of California, Berkeley (United States)
Melody Huang - Yale University (United States)
Abstract: Researchers are often interested in understanding underlying treatment effect heterogeneity. While recent methodological developments have introduced black-box approaches that better estimate heterogeneous treatment effects, these methods provide only an estimate of the individual-level treatment effect and fall short of characterizing which individuals are most at risk or benefit most from receiving the treatment. A method, causal distillation trees (CDT), is introduced that allows researchers to stably estimate interpretable subgroups in their studies. CDT allows researchers to fit any machine learning model of their choice to estimate the individual-level treatment effect and then leverages a simple, second-stage tree-based model to distil the estimated treatment effect into meaningful subgroups. As a result, CDT inherits the theoretical guarantees of black-box machine learning models while preserving the interpretability of a simple decision tree. The stability of CDT in estimating substantively meaningful subgroups is theoretically characterized, and helpful diagnostics are provided for researchers to evaluate the quality of the estimated subgroups. The method is empirically demonstrated via extensive simulations and a case study evaluating the impact of canvassing on voter turnout. It is shown that CDT outperforms state-of-the-art approaches in identifying interpretable subgroups.
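
The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the first-stage learner (a T-learner built from two random forests), the function name `distill_subgroups`, the hyperparameters, and the synthetic data are all assumptions chosen for illustration. The first stage estimates individual-level treatment effects with a black-box model, and the second stage fits a shallow decision tree to those estimates so that its leaves can be read as candidate subgroups.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor, export_text


def distill_subgroups(X, treatment, y, max_depth=2, random_state=0):
    """Hypothetical helper: estimate effects, then distil them into a tree."""
    treated = treatment == 1

    # Stage 1 (assumed T-learner): one outcome model per treatment arm.
    # Any other individual-level effect estimator could be swapped in here.
    model_treated = RandomForestRegressor(n_estimators=100, random_state=random_state)
    model_control = RandomForestRegressor(n_estimators=100, random_state=random_state)
    model_treated.fit(X[treated], y[treated])
    model_control.fit(X[~treated], y[~treated])
    cate_hat = model_treated.predict(X) - model_control.predict(X)

    # Stage 2: a shallow tree regresses the estimated effects on covariates,
    # so each leaf corresponds to an interpretable candidate subgroup.
    tree = DecisionTreeRegressor(
        max_depth=max_depth, min_samples_leaf=50, random_state=random_state
    )
    tree.fit(X, cate_hat)
    return tree, cate_hat


# Synthetic randomized experiment (an assumption for illustration only):
# the treatment helps only units with x0 > 0.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
treatment = rng.integers(0, 2, size=n)
true_effect = np.where(X[:, 0] > 0, 2.0, 0.0)
y = X[:, 1] + true_effect * treatment + rng.normal(scale=0.5, size=n)

tree, cate_hat = distill_subgroups(X, treatment, y)
print(export_text(tree, feature_names=["x0", "x1", "x2"]))
```

The printed tree lists the split rules that define each subgroup, with the leaf value giving the average estimated effect in that subgroup. The stability diagnostics mentioned in the abstract are not reproduced in this sketch.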