A1278
Title: Learning and optimizing across large-scale experimentation programs
Authors: Simon Ejdemyr - Netflix (United States) [presenting]
Abstract: Firms with mature A/B testing infrastructure run thousands of randomized experiments annually. Leveraging such data requires statistical methods that enable learning across weak, heterogeneous treatment effects. Several recent advances are presented. First, a Bayesian hierarchical model estimates cumulative returns from related experiments, supporting strategic inference about innovation programs. Second, experimentation is reframed as an optimization problem: using dynamic programming to tune decision rules -- such as $p$-value thresholds -- improves decision quality across test portfolios. Third, decision rules are evaluated via a cross-validation estimator of cumulative returns that corrects for noise-induced bias in plug-in estimates. Finally, meta-analysis techniques adapted from the weak-instruments literature -- such as LIML and JIVE -- enable construction of proxy metrics, which play a central role in optimizing returns when primary outcomes are noisy or delayed. Applied to historical experiments, these methods yield more reliable estimates of program value, improved decision strategies, and better long-term outcome targeting under uncertainty. Together, they demonstrate how econometric tools can support principled decision making in complex experimentation ecosystems.
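
To make the first idea concrete, here is a minimal sketch (not the talk's implementation) of a normal-normal hierarchical model that pools noisy per-experiment treatment-effect estimates and reports program-level quantities plus a shrunken cumulative return. The simulated effects, standard errors, and function names are illustrative assumptions only.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def fit_hierarchical(theta_hat, se):
    """Fit theta_i ~ N(mu, tau^2) given theta_hat_i ~ N(theta_i, se_i^2)."""
    theta_hat, se = np.asarray(theta_hat), np.asarray(se)

    def neg_marginal_loglik(log_tau):
        tau2 = np.exp(log_tau) ** 2
        var = se**2 + tau2                                 # marginal variance of theta_hat_i
        mu = np.sum(theta_hat / var) / np.sum(1.0 / var)   # GLS mean given tau
        return 0.5 * np.sum(np.log(var) + (theta_hat - mu) ** 2 / var)

    res = minimize_scalar(neg_marginal_loglik, bounds=(-10, 5), method="bounded")
    tau2 = np.exp(res.x) ** 2
    var = se**2 + tau2
    mu = np.sum(theta_hat / var) / np.sum(1.0 / var)
    shrink = tau2 / (tau2 + se**2)                 # how much each estimate is trusted
    post_mean = shrink * theta_hat + (1 - shrink) * mu
    return mu, tau2, post_mean

# Simulated program of 200 experiments with weak, heterogeneous effects.
rng = np.random.default_rng(0)
true_effects = rng.normal(0.2, 0.5, size=200)
se = np.full(200, 1.0)
theta_hat = true_effects + rng.normal(0, se)

mu, tau2, post_mean = fit_hierarchical(theta_hat, se)
print("program mean effect (mu):", round(mu, 2))
print("effect heterogeneity (tau):", round(np.sqrt(tau2), 2))
print("shrunken cumulative return:", round(post_mean.sum(), 1))
print("true cumulative return:    ", round(true_effects.sum(), 1))
```

The program-level parameters (mu, tau) support strategic statements about the innovation portfolio, while the shrunken per-experiment effects give a lower-variance estimate of cumulative value than raw estimates.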
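
The third idea can likewise be illustrated with a hedged, self-contained simulation of a split-sample (cross-validation style) estimator of a $p$-value-threshold shipping rule's cumulative return. Selecting and scoring on the same estimates (the plug-in approach) overstates returns because noisy overestimates clear the threshold more often; selecting on one half-sample and scoring on the held-out half removes that bias. All numbers below are simulated illustrations, not results from the talk.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n_exp, se_full = 2000, 1.0
true = rng.normal(0.0, 0.3, size=n_exp)          # mostly weak effects

# Two independent half-sample estimates per experiment.
se_half = se_full * np.sqrt(2)
est_a = true + rng.normal(0, se_half, n_exp)     # used to decide
est_b = true + rng.normal(0, se_half, n_exp)     # held out, used to score

alpha = 0.05
z = est_a / se_half
ship = (2 * (1 - norm.cdf(np.abs(z))) < alpha) & (est_a > 0)

plug_in = est_a[ship].sum()      # selected on its own noise -> biased upward
cross_val = est_b[ship].sum()    # independent of the selection -> unbiased
truth = true[ship].sum()

print(f"shipped {ship.sum()} of {n_exp}")
print(f"plug-in return estimate : {plug_in:7.1f}")
print(f"cross-validated estimate: {cross_val:7.1f}")
print(f"true return of the rule : {truth:7.1f}")
```

Sweeping `alpha` over a grid and comparing cross-validated returns is one simple way to tune the decision threshold without being misled by selection noise.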