View Submission - COMPSTAT2024
A0466
Title: Mixed high-dimensional network inference via the Gaussian copula
Authors: Ekaterina Tomilina - INRAE (France) [presenting]
Gildas Mazo - INRAE (France)
Florence Jaffrezic - INRAE (France)
Abstract: Large-scale heterogeneous data integration for network inference is a key methodological challenge, especially in the context of multi-omic data analysis. A novel procedure is proposed based on copula theory, which allows the joint analysis of data of various types (continuous, discrete, etc.). The proposed estimation procedure is semi-parametric and therefore does not require any explicit assumption about the marginal distributions of the data. This offers great flexibility for the analysis of biological data, which may not exactly follow any pre-specified parametric distribution. A theoretical proof is also presented, showing the equivalence between block-wise independence in the copula correlation matrix and the actual data correlation structure. In an extensive simulation study, the proposed estimation procedure, based on a pairwise pseudo-likelihood approach, is shown to accurately estimate the copula correlation matrix, even for a fairly large number of variables (several hundred) and a fairly small number of replicates (several dozen). The proposed method was also applied to a real ICGC dataset on breast cancer.
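To make the idea of pairwise, semi-parametric estimation of a Gaussian copula correlation matrix concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes continuous variables only, so the discrete and mixed cases that the abstract targets are not handled, and it replaces the unknown margins with rescaled ranks (normal scores). The function names `normal_scores`, `pair_neg_loglik`, and `pairwise_copula_corr`, as well as the simulated data, are hypothetical and chosen for illustration.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm, rankdata


def normal_scores(x):
    """Map a sample to Gaussian scores using rescaled ranks.

    Assumption: the variable is continuous, so ties are negligible.
    """
    n = len(x)
    u = rankdata(x) / (n + 1)  # pseudo-observations strictly inside (0, 1)
    return norm.ppf(u)


def pair_neg_loglik(rho, z1, z2):
    """Negative log-likelihood of the bivariate Gaussian copula density."""
    s = 1.0 - rho**2
    quad = rho**2 * (z1**2 + z2**2) - 2.0 * rho * z1 * z2
    return -np.sum(-0.5 * np.log(s) - quad / (2.0 * s))


def pairwise_copula_corr(X):
    """Estimate each copula correlation separately, one pair at a time."""
    n, p = X.shape
    Z = np.column_stack([normal_scores(X[:, j]) for j in range(p)])
    R = np.eye(p)
    for j in range(p):
        for k in range(j + 1, p):
            res = minimize_scalar(
                pair_neg_loglik,
                bounds=(-0.99, 0.99),
                args=(Z[:, j], Z[:, k]),
                method="bounded",
            )
            R[j, k] = R[k, j] = res.x
    return R


# Simulated example (hypothetical data): latent Gaussian correlation of 0.6
# between the first two variables, third variable independent.
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.6, 0.0],
                [0.6, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
latent = rng.multivariate_normal(np.zeros(3), cov, size=500)

# Monotone transforms give non-Gaussian margins but leave the copula unchanged.
X = np.column_stack([np.exp(latent[:, 0]), latent[:, 1] ** 3, latent[:, 2]])

R_hat = pairwise_copula_corr(X)
print(np.round(R_hat, 2))
```

Because each entry is estimated from its own pair of variables, the cost per pair does not depend on the total number of variables. One caveat of this simplified sketch is that the assembled matrix is not guaranteed to be positive semi-definite, so a projection step may be needed before using it for network inference.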