A0200
Title: Multiway-SIR for longitudinal multi-table data integration
Authors: Valerie Sautron - INRA (France)
Marie Chavent - Bordeaux University (France)
Nathalie Viguerie - INSERM (France)
Nathalie Villa-Vialaneix - INRA (France) [presenting]
Abstract: An extension of DUAL-STATIS to the sliced inverse regression (SIR) framework is proposed to analyze multi-table datasets with respect to a numeric variable of interest. The method is designed for the case where a dataset $\mathbf{X}$, consisting of $p$ variables measured $T$ times on the same $n$ subjects, is related to a real target variable $\mathbf{y}$ measured on the same $n$ subjects. The approach is exploratory and aims at understanding how the relation between $\mathbf{X}$ and $\mathbf{y}$ evolves through time. The method proceeds in two steps: 1) an inter-structure analysis, which studies the resemblance between the different time steps by computing similarities between estimates of the covariance of the mean of $\mathbf{X}_{..t}$ conditional on $\mathbf{y}$. As in SIR, the conditional expectation is estimated by slicing the range of $\mathbf{y}$. The result of this analysis is a compromise covariance matrix $\Gamma^c$, which captures a compromise correlation structure of $\mathbb{E}(\mathbf{X}_{..t}|\mathbf{y})$ over $t$; 2) an intra-structure analysis, which is a generalized PCA of the compromise. This second step produces graphical outputs that can be used to explore the covariance structure between variables and time steps conditional on $\mathbf{y}$. The method is illustrated on a real problem related to the consequences of a low-calorie diet on obese persons, in which the target variable of interest is the weight gain/loss.
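The two steps can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the function names `slice_mean_covariance` and `multiway_sir_sketch` are hypothetical, slices are taken as equal-sized groups of the sorted $\mathbf{y}$, similarities between time steps are Hilbert-Schmidt inner products, the compromise weights come from the leading eigenvector of the similarity matrix (as in STATIS-type methods), and the generalized PCA is reduced to a plain eigendecomposition (identity metric).

```python
import numpy as np

def slice_mean_covariance(X_t, y, n_slices=5):
    """Estimate Cov(E[X_t | y]) for one time step by slicing the range of y.

    X_t has shape (n, p); y has shape (n,).
    """
    n, p = X_t.shape
    Xc = X_t - X_t.mean(axis=0)           # center the variables
    order = np.argsort(y)                 # subjects sorted by target value
    gamma = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Xc[idx].mean(axis=0)          # slice mean of the centered data
        gamma += (len(idx) / n) * np.outer(m, m)
    return gamma

def multiway_sir_sketch(X, y, n_slices=5, n_components=2):
    """Illustrative two-step analysis; X has shape (n, p, T)."""
    T = X.shape[2]
    gammas = [slice_mean_covariance(X[:, :, t], y, n_slices) for t in range(T)]

    # Step 1 (inter-structure): similarities between time steps,
    # here assumed to be Hilbert-Schmidt inner products <G_s, G_t>.
    S = np.array([[np.sum(a * b) for b in gammas] for a in gammas])
    _, vecs = np.linalg.eigh(S)
    u = np.abs(vecs[:, -1])               # leading eigenvector of S
    alpha = u / u.sum()                   # weights normalized to sum to 1
    gamma_c = sum(a * g for a, g in zip(alpha, gammas))  # compromise

    # Step 2 (intra-structure): eigendecomposition of the compromise.
    # Assumption: identity metric instead of a genuine generalized PCA.
    evals, evecs = np.linalg.eigh(gamma_c)
    top = np.argsort(evals)[::-1][:n_components]
    return alpha, gamma_c, evals[top], evecs[:, top]
```

Calling `multiway_sir_sketch(X, y)` on an array `X` of shape `(n, p, T)` returns the time-step weights, the compromise matrix, and its leading eigenvalues and eigenvectors, which could then be plotted to explore variables and time steps.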