
A0837

Title: Reinforcement learning for estimation and inference in heterogeneous environments
Authors: Atanas Christev - Heriot-Watt University (United Kingdom) [presenting]
Abstract: Sequential decision-making is central to many social and economic phenomena. Reinforcement learning (RL) is an optimization framework that equips agents to interact with and learn from an unknown environment sequentially, i.e., to learn optimal policies for their own behavior over multiple steps through exploration and exploitation. Classical methods struggle to deal with the inherent heterogeneity of structural economic models. Based on inverse reinforcement learning (IRL), a novel estimation framework is proposed that simultaneously clusters behavioral trajectories and infers distinct type-specific utility functions for latent groups, thereby allowing for unobserved heterogeneity. The method supports scalable estimation without explicit transition models and accommodates flexible, non-parametric rewards. Theoretically, the method is shown to exhibit oracle properties for group recovery under sufficient reward separation, as well as asymptotic normality, enabling the construction of confidence intervals for policy parameters. The method is applied to a structural representation of a frictional labor market with life cycle dynamics to account for income and risk inequality over the business cycle. Low earners behave very differently from high earners: initial wealth and search effort in the labor market have strong implications for the large negative skewness of lifetime earnings in each of these two groups.