A0723
Title: Data integration via analysis of subspaces (DIVAS)
Authors: Jan Hannig - University of North Carolina at Chapel Hill (United States) [presenting]
Quoc Tran-Dinh - University of North Carolina at Chapel Hill (United States)
Jack Prothero - National Institute of Standards and Technology (United States)
Andrew Ackerman - University of North Carolina at Chapel Hill (United States)
Meilei Jiang - University of North Carolina at Chapel Hill (United States)
Steve Marron - University of North Carolina at Chapel Hill (United States)
Abstract: Modern data collection in many fields, including bioinformatics, often incorporates multiple traits derived from different data types (i.e., platforms). Such data are called multi-block, multi-view, or multi-omics data. The emerging field of data integration develops and applies new methods for studying multi-block data and identifying how different data types relate and differ. One major frontier in contemporary data integration research is methodology that can identify partially shared structure between sub-collections of data types. A new approach is presented: data integration via analysis of subspaces (DIVAS). DIVAS combines new insights in angular subspace perturbation theory with recent developments in matrix signal processing and convex-concave optimization into one algorithm for exploring partially shared structure. Because it is based on principal angles between subspaces, DIVAS provides built-in inference on the results of the analysis and is effective even in high-dimension, low-sample-size (HDLSS) situations.
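The principal angles mentioned in the abstract can be illustrated with a short sketch. This is not the DIVAS algorithm, only the standard computation of principal angles between two column spaces via orthonormal bases and a singular value decomposition. The function name `principal_angles` and the toy matrices are hypothetical, chosen for illustration, and both inputs are assumed to have full column rank.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians, ascending) between the column spaces of A and B.

    Assumes A and B have the same number of rows and full column rank.
    """
    # Orthonormal bases for each column space via reduced QR.
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # Clip to guard against round-off pushing values slightly outside [-1, 1].
    return np.arccos(np.clip(cosines, -1.0, 1.0))

# Toy example: two 2-dimensional subspaces of R^3 that share the direction e1.
A = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])  # span{e1, e2}
B = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])  # span{e1, e3}
angles = principal_angles(A, B)
```

In this toy case the smallest angle is zero, reflecting the shared direction, and the other is a right angle. A principal angle near zero between the subspaces of two data blocks is the kind of signal that suggests shared structure.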