CMStatistics 2023
B1746
Title: Empirical risk minimization in transductive transfer learning
Authors: Patrice Bertail - Université Paris-Nanterre and Telecom ParisTech (France) [presenting]
Stephan Clemencon - Telecom ParisTech (France)
Yannick Guyonvarch - INRAE (France)
Nathan Noiry - Telecom Paris (France)
Abstract: Risk minimization problems are considered where the source distribution $P_S$ of the training observations $Z'_1,\; \ldots,\; Z'_n$ differs from the target distribution $P_T$ involved in the risk one seeks to minimize, but is defined on the same measurable space as $P_T$ and dominates it. The goal is to develop a semi-parametric framework for this specific transfer learning problem when auxiliary information about the target statistical population is available, in the form of expectations of known functions taken w.r.t. $P_T$. Under the assumption that the Radon-Nikodym derivative $dP_T/dP_S(z)$ belongs to a parametric class $\{g(z,\alpha):\; \alpha\in \mathcal{A}\}$, combined with suitable identifiability conditions, it is shown that a weighted empirical risk minimization (ERM) problem can be formulated with random weights, determined by finding a parameter value $\hat{\alpha}$ such that the empirical versions of the aforementioned expectations, based on the $Z'_i$'s, match the corresponding $P_T$-integrals. A generalization bound is established proving that, remarkably, the solution to the weighted ERM problem thus constructed achieves a learning rate of the same order, $O_{\mathbb{P}}(1/\sqrt{n})$, as that attained in the absence of any sampling bias. Beyond these theoretical guarantees, numerical results provide strong empirical evidence of the relevance of the approach.
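The two-step scheme described in the abstract can be illustrated with a minimal numerical sketch. The example below is an assumption-laden toy instance, not the authors' implementation: the source is $N(0,1)$, the target is $N(\mu,1)$, the parametric class of density ratios is the exponential tilting family $g(z,\alpha)=\exp(\alpha z - \alpha^2/2)$ (which contains the true ratio here), and the auxiliary information is the single known target moment $\mathbb{E}_{P_T}[Z]=\mu$. Step one calibrates $\hat{\alpha}$ by matching the weighted empirical moment to the known $P_T$-expectation; step two solves a weighted ERM problem (squared loss, so the minimizer is a weighted mean).

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(0)

# Toy setting (an assumption for illustration): source P_S = N(0, 1),
# target P_T = N(mu, 1). The true Radon-Nikodym derivative is
# dP_T/dP_S(z) = exp(mu*z - mu^2/2), a member of the tilting class below.
mu = 1.0
n = 50_000
z = rng.normal(0.0, 1.0, size=n)   # training sample Z'_1, ..., Z'_n ~ P_S

def g(z, alpha):
    """Parametric density-ratio class g(z, alpha) = exp(alpha*z - alpha^2/2)."""
    return np.exp(alpha * z - 0.5 * alpha**2)

# Step 1: calibrate alpha_hat so that the weighted empirical version of the
# auxiliary moment E_{P_T}[Z] = mu matches its known value.
def moment_gap(alpha):
    return np.mean(g(z, alpha) * z) - mu

alpha_hat = brentq(moment_gap, -5.0, 5.0)

# Step 2: weighted ERM with squared loss l(theta, z) = (theta - z)^2;
# the minimizer is the (self-normalized) weighted sample mean, which
# estimates the target mean despite training only on source data.
w = g(z, alpha_hat)
theta_hat = np.sum(w * z) / np.sum(w)

print(f"alpha_hat = {alpha_hat:.3f}, theta_hat = {theta_hat:.3f}")
```

With this sample size both `alpha_hat` and `theta_hat` land close to `mu`, whereas the unweighted sample mean of `z` would stay near the source mean 0, illustrating the bias that the calibrated weights correct.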