A0934
Title: Spatial multivariate trees for integrating geospatial data from multiple sources
Authors: Michele Peruzzi - University of Michigan (United States) [presenting]
David Dunson - Duke University (United States)
Abstract: High-resolution geospatial data are challenging because standard geostatistical models based on Gaussian processes do not scale to large data sizes. While progress has been made towards methods that can be computed more efficiently, considerably less attention has been devoted to methods for large-scale data that characterize complex relationships between several outcomes recorded at high resolutions by different sensors or data sources. In these settings, popular coregionalization models, along with assumptions of conditional independence across spatial neighbors, may be inappropriate when the spatial resolution of one data source is much lower than that of the others. Our spatial multivariate trees (SpamTrees) rely on conditional independence assumptions on latent random effects, encoded by a treed directed acyclic graph. SpamTrees can be interpreted as a multiscale method for multivariate data in which outcomes that are more sparsely observed are placed at tree heights corresponding to coarser scales. Information-theoretic arguments and considerations of computational efficiency guide the construction of the tree and the related efficient sampling algorithms in these imbalanced settings. We illustrate SpamTrees using a large climate data set that combines high-resolution satellite data with sparsely observed land-based station data.