CFE-CMStatistics 2024
A0719
Title: To link or not to link: Estimating long-run treatment effects from historical data
Authors: Francis DiTraglia - University of Oxford (United Kingdom) [presenting]
Ezra Karger - Federal Reserve Bank of Chicago (United States)
Camilo Garcia Jimeno - Federal Reserve Bank of Chicago (United States)
Abstract: A fundamental challenge in empirical research is the reliance on datasets linked with error. Researchers must often match a dataset containing treatment status to a separate dataset with outcome measures, typically relying on non-unique information such as names and demographic characteristics. This imperfect linking process raises serious concerns about statistical efficiency, measurement error, and sample selection. For instance, because nearly all married women in the 1900s changed their surname upon marriage, most research in economic history using linked data excludes women entirely, leaving many important historical questions about them unanswered. A unified method is developed for precisely estimating long-run treatment effects using information from two datasets without explicitly constructing a linked dataset or discarding observations. The approach uses available linking covariates as efficiently as possible, allowing for measurement error in these variables and heterogeneous treatment effects. The method nests the typical approach of using unique matches but extends it to cases where such matches are not universally available. To demonstrate the practical implications of the methodology, it is used to revisit research on compulsory schooling laws and the inter-generational effects of slave-holding on the wealth of slave-owning families.