View Submission - EcoSta2023
A1198
Title: Causal clustering
Authors: Kwangho Kim - Korea University (Korea, South) [presenting]
Edward Kennedy - Carnegie Mellon University (United States)
Larry Wasserman - Carnegie Mellon University (United States)
Abstract: Causal effects are often summarized by population-level effects, which can give an incomplete picture when treatment effects within subpopulations differ considerably from the population effect. Because the subgroup structure is usually unknown, identifying and estimating subpopulation effects is more challenging than estimating population-level effects. Causal clustering, a new set of methods for exploring heterogeneity of treatment effects using tools from cluster analysis, is developed. First, an efficient way to uncover subgroup structure by harnessing widely used clustering methods is developed. Specifically, it is shown that k-means, density-based, and hierarchical clustering algorithms can be incorporated into the framework via plug-in estimators. Next, for k-means causal clustering, a bias-corrected estimator based on nonparametric efficiency theory is developed, which converges at fast rates to the true cluster centres and is asymptotically normal under weak nonparametric conditions. This requires novel techniques because of the non-smooth form of the k-means risk. Novel tools that are especially useful for modern outcome-wide studies with many treatment levels are also derived. Importantly, it is also discussed how these methods extend to clustering with generic pseudo-outcomes, e.g., partially observed outcomes or unknown functionals. Finally, the methods are illustrated via simulation studies and real data analyses.
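The plug-in k-means idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the construction, so everything below is an assumption made for illustration: the outcome models are per-arm least-squares fits (any regression method could stand in), units are clustered on their estimated effect vectors relative to a baseline arm, and the function names `simulate`, `fit_arm_regressions`, `plug_in_effects`, and `kmeans` are hypothetical. The sketch does not implement the bias-corrected estimator.

```python
import numpy as np

def simulate(n=600, seed=0):
    """Toy data (an assumption): two latent subgroups, three treatment levels."""
    rng = np.random.default_rng(seed)
    g = rng.integers(0, 2, size=n)           # latent subgroup, also a covariate
    x2 = rng.normal(size=n)                  # a covariate unrelated to the effects
    X = np.column_stack([g, x2])
    A = rng.integers(0, 3, size=n)           # randomized treatment level
    # Row = subgroup, column = treatment level; level 0 is the baseline.
    effects = np.array([[0.0, 1.0, -1.0], [0.0, -1.0, 2.0]])
    Y = 0.5 * x2 + effects[g, A] + rng.normal(scale=0.5, size=n)
    return X, A, Y, g

def fit_arm_regressions(X, A, Y, levels):
    """Fit one least-squares outcome model per treatment level."""
    Xd = np.column_stack([np.ones(len(X)), X])
    return {a: np.linalg.lstsq(Xd[A == a], Y[A == a], rcond=None)[0] for a in levels}

def plug_in_effects(X, coefs, levels, baseline):
    """Estimated effect vector per unit: mu_a(x) - mu_baseline(x) for each other level."""
    Xd = np.column_stack([np.ones(len(X)), X])
    mu = {a: Xd @ coefs[a] for a in levels}
    return np.column_stack([mu[a] - mu[baseline] for a in levels if a != baseline])

def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm with deterministic farthest-first initialization."""
    centres = [points[0]]
    for _ in range(1, k):
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centres], axis=0)
        centres.append(points[d.argmax()])
    centres = np.array(centres)
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centres = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    # Recompute labels so they always match the returned centres.
    dists = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1), centres
```

A typical call would be `X, A, Y, g = simulate()`, then `coefs = fit_arm_regressions(X, A, Y, [0, 1, 2])`, `tau = plug_in_effects(X, coefs, [0, 1, 2], baseline=0)`, and `labels, centres = kmeans(tau, k=2)`. Each row of `centres` is then an estimated subgroup-level effect vector; cluster labels are identified only up to permutation.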