A0377
Title: Efficient transfer learning in partially observable offline data via causal bounds
Authors: Wei You - HKUST (Hong Kong) [presenting]
Xueping Gong - HKUST (Hong Kong)
Jiheng Zhang - HKUST (Hong Kong)
Abstract: Transfer learning accelerates learning by leveraging knowledge from related source agents, yet heterogeneous data pose challenges because causal effect estimates can be biased. Transfer learning is studied in partially observable contextual bandits, where agents have incomplete information and face hidden confounders. To address the non-identifiability caused by unobserved confounders, tight causal bounds are derived through optimization: functional constraints are discretized into linear forms, and compatible causal models are sampled by sequentially solving linear programs that account for estimation errors. The approach exhibits robust convergence and yields reliable causal bounds that improve classical bandit algorithms, achieving regret bounds with tighter dependence on the sizes of the action set and the function space. For general context spaces requiring function approximation, the method significantly improves the dependence on the size of the function space compared with prior work. It is formally proven that the causally enhanced algorithms outperform their classical counterparts, achieving faster convergence rates. An example of offline pricing policy learning with censored demand demonstrates the method's practical benefits. Simulations confirm superiority over state-of-the-art methods, highlighting its potential in applications where data are scarce, costly, or privacy-restricted.
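
The following is a minimal illustrative sketch, not the authors' formulation, of the kind of linear-programming discretization the abstract describes: bounding an interventional quantity from confounded offline data by optimizing over latent causal models consistent with the observed distribution, with a slack term standing in for estimation error. The binary action/outcome setting, the response-type parameterization, and the names `causal_bounds`, `p_obs`, and `eps` are assumptions introduced purely for illustration.

```python
"""Sketch: LP-based causal bounds on P(Y=1 | do(A=a)) from confounded offline data.

Assumptions (for illustration only): binary action A and binary outcome Y,
a response-type parameterization of the latent causal model, and a slack
`eps` on the observational constraints to mimic finite-sample estimation error.
"""
import itertools
import numpy as np
from scipy.optimize import linprog


def causal_bounds(p_obs, target_action, eps=0.0):
    """Bounds on P(Y=1 | do(A=target_action)).

    p_obs[y][a] = empirical estimate of P(Y=y, A=a) from offline data.
    eps         = allowed deviation from the empirical joint (estimation error).
    """
    # Latent variables: q[(r, a)] = P(response type r, natural action a),
    # where r = (y if A=0, y if A=1) encodes both potential outcomes.
    types = list(itertools.product([0, 1], repeat=2))   # 4 response types
    actions = [0, 1]
    var_index = {(r, a): i for i, (r, a) in
                 enumerate(itertools.product(types, actions))}
    n = len(var_index)

    # Objective: P(Y=1 | do(a*)) = total mass of types with r[a*] = 1.
    c = np.zeros(n)
    for (r, a), i in var_index.items():
        if r[target_action] == 1:
            c[i] = 1.0

    # Equality constraint: total probability mass sums to one.
    A_eq = np.ones((1, n))
    b_eq = np.array([1.0])

    # Inequality constraints: observational consistency up to slack eps,
    #   |sum_{r: r[a]=y} q[(r, a)] - p_obs[y][a]| <= eps   for all (y, a).
    A_ub, b_ub = [], []
    for y in [0, 1]:
        for a in actions:
            row = np.zeros(n)
            for r in types:
                if r[a] == y:
                    row[var_index[(r, a)]] = 1.0
            A_ub.append(row);  b_ub.append(p_obs[y][a] + eps)
            A_ub.append(-row); b_ub.append(-(p_obs[y][a] - eps))

    bounds = []
    for sign in (+1.0, -1.0):  # minimize, then maximize (via negation)
        res = linprog(sign * c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                      A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * n,
                      method="highs")
        bounds.append(sign * res.fun)
    return tuple(bounds)  # (lower bound, upper bound)


if __name__ == "__main__":
    # Hypothetical empirical joint P(Y=y, A=a) from confounded offline data.
    p_obs = {0: {0: 0.2, 1: 0.1}, 1: {0: 0.3, 1: 0.4}}
    lo, hi = causal_bounds(p_obs, target_action=1, eps=0.02)
    print(f"P(Y=1 | do(A=1)) in [{lo:.3f}, {hi:.3f}]")
```

With eps=0 this LP recovers the classical worst-case (Manski-style) bounds; the resulting interval is the kind of causal bound that can then be used to truncate or warm-start a downstream bandit algorithm, which is the role such bounds play in the abstract.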