View Submission - EcoSta2024
A1112
Title: High-dimensional learning for multi-sourced matrix data
Authors: Chenglong Ye - University of Kentucky (United States) [presenting]
Abstract: The abundance of data sources has made it possible to learn user preferences for products from various user-product interactions. In contrast to the existing literature, which models differences between regression coefficient means, we explore the setting in which covariates are absent or hard to access due to privacy concerns. We treat user-product preferences in the primary data as a partially observed main matrix and introduce a matrix transfer learning algorithm that leverages low-rank matrix estimation techniques to facilitate the transfer. The proposed method uses one or more auxiliary datasets to help predict the unobserved entries of the main matrix, and it outperforms existing covariate-free methods in both synthetic and real data settings. Theoretically, we derive an upper bound on the prediction error of the proposed approach that explains the interplay between the sample sizes of the main and auxiliary data and the difference between them. We further cast heterogeneous treatment effect estimation from panel data in the causal inference setting as a transfer learning task and adapt our algorithm to impute such effects. We benchmark our method against existing matrix-completion-based algorithms and show the benefits of using ours.
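Illustration: the abstract does not spell out the algorithm, so the following is only a minimal sketch of one generic way low-rank estimation could carry information from an auxiliary matrix to a main matrix. It assumes a two-stage scheme (fit the auxiliary matrix, then fit a low-rank correction on the main matrix's observed entries) with SVD soft-thresholding as the low-rank estimator. The function names `soft_impute` and `transfer_complete` and the parameters `lam_aux` and `lam_diff` are hypothetical and are not taken from the presented work.

```python
import numpy as np


def soft_impute(Y, mask, lam, n_iter=200):
    """Low-rank completion by iterative SVD soft-thresholding.

    Y    : 2-D array; values where mask is False are ignored.
    mask : boolean array, True where Y is observed.
    lam  : soft-threshold applied to the singular values.
    """
    X = np.zeros(Y.shape, dtype=float)
    for _ in range(n_iter):
        # Keep observed entries, fill the rest with the current estimate.
        filled = np.where(mask, Y, X)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Shrink singular values toward zero to encourage low rank.
        X = (U * np.maximum(s - lam, 0.0)) @ Vt
    return X


def transfer_complete(Y_main, mask_main, Y_aux, mask_aux, lam_aux, lam_diff):
    """Two-stage sketch (an assumption, not the authors' method).

    Stage 1: estimate the auxiliary matrix from its own observations.
    Stage 2: estimate a low-rank correction from the main matrix's
             observed residuals and add it to the stage-1 estimate.
    """
    A_hat = soft_impute(Y_aux, mask_aux, lam_aux)
    resid = np.where(mask_main, Y_main - A_hat, 0.0)
    D_hat = soft_impute(resid, mask_main, lam_diff)
    return A_hat + D_hat
```

In this sketch a larger `lam_diff` keeps the main-matrix estimate closer to the auxiliary one, which is the sensible choice when the main matrix has few observed entries and the two sources are believed to be similar. With several auxiliary datasets, one simple option under the same assumptions would be to pool or average their stage-1 estimates before fitting the correction.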