COMPSTAT 2016
A0330
Title: Simultaneous principal and network components models for integration of multiset data
Authors: Pia Tio - University of Amsterdam and Tilburg University (Netherlands) [presenting]
Abstract: With increasingly sophisticated instruments and growing interdisciplinary cooperation, ever larger datasets are being gathered that contain, for the same individuals, detailed information from multiple data sources. Because such multiset data are heterogeneous and high-dimensional, relevant information is hidden among a bulk of irrelevant variables, and there is a high risk of finding spurious associations by chance. To take full advantage of multiset data, mature data integration techniques are needed that yield interpretable results from massive amounts of data. While data reduction techniques can deal with some of these obstacles, their resulting components/factors are difficult to interpret. We propose that the combined use of component and network analysis can deal with these obstacles simultaneously. Component analysis techniques such as sparse simultaneous component analysis are especially useful for extracting subtle common processes from among dominant source-specific variation while at the same time reducing the dimensionality of the data. Network analysis provides a means to investigate unique processes among mutually interacting components/variables within a source without having to resort to latent variables. We present results from simulations showing the performance of the combined component and network analyses.
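As a rough illustration of the two ingredients named in the abstract, the sketch below simulates two data blocks measured on the same individuals, extracts a common component from the concatenated blocks, and then estimates a partial-correlation network on what remains within one block. It is not the authors' method: the plain (non-sparse) SVD stands in for sparse simultaneous component analysis, the unregularized partial correlations stand in for whatever network estimator the talk uses, and all function names, sample sizes, and noise levels are hypothetical choices made for this example.

```python
import numpy as np


def simulate_blocks(n=200, p1=6, p2=5, seed=0):
    """Simulate two data blocks sharing one common process (assumed setup)."""
    rng = np.random.default_rng(seed)
    common = rng.standard_normal((n, 1))           # shared latent process
    load1 = rng.standard_normal((1, p1))           # loadings in block 1
    load2 = rng.standard_normal((1, p2))           # loadings in block 2
    x1 = common @ load1 + 0.5 * rng.standard_normal((n, p1))
    x2 = common @ load2 + 0.5 * rng.standard_normal((n, p2))
    return x1, x2


def simultaneous_components(blocks, n_components=1):
    """Non-sparse simultaneous component analysis via SVD of the
    column-wise concatenation of the centered blocks."""
    centered = [b - b.mean(axis=0) for b in blocks]
    concat = np.hstack(centered)
    u, s, vt = np.linalg.svd(concat, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]    # n x k component scores
    loadings = vt[:n_components].T                      # (p1 + p2) x k loadings
    return scores, loadings, centered


def partial_correlation_network(x):
    """Partial correlations from the (pseudo-)inverse covariance matrix."""
    precision = np.linalg.pinv(np.cov(x, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor


x1, x2 = simulate_blocks()
scores, loadings, centered = simultaneous_components([x1, x2])

# Remove the common component from block 1, then look at the
# source-specific structure that is left as a network.
residual1 = centered[0] - scores @ loadings[: x1.shape[1]].T
network1 = partial_correlation_network(residual1)
print(np.round(network1, 2))
```

In this sketch the off-diagonal entries of `network1` play the role of edge weights between variables within the first source; a sparse or regularized estimator would typically be preferred for high-dimensional data.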