A1306
Title: Machine learning and spatiotemporal statistics in the ocean: Fusing data sources and making nonlinear predictions
Authors: Adam Sykulski - Imperial College London (United Kingdom) [presenting]
Abstract: The ocean is observed through a variety of means, including remotely from satellites and in situ from instruments deployed at sea. Sometimes these measurements agree, albeit with different observational noise and sampling characteristics; at other times they measure fundamentally different but related quantities. Reconciling these data sources is therefore desirable but poses a significant challenge, and how best to do it depends on the task at hand. Two examples are presented of how machine learning and spatiotemporal statistics can be used to fuse heterogeneous data sources in a nonlinear sense to make better predictions. The first example attempts to predict ocean surface velocities using satellite data only, with the model trained on data from floating instruments called drifters. It uses a probabilistic prediction framework implemented with a novel multivariate natural gradient boosting approach. The second example fuses satellite and drifter data to predict the abundance of Antarctic krill in the Southern Ocean. As the data are "zero-inflated", in that krill tend to swarm and can otherwise be absent from large regions of the ocean, a hurdle Gamma model is employed to first model presence-absence and then predict the abundance, throughout using many covariates from satellites and instruments at sea to improve predictions in a nonlinear sense.
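
The two-stage structure of a hurdle Gamma model can be illustrated with a minimal sketch. This is not the implementation used in the work; it only shows the generic form of such a model, in which a zero observation has probability 1 - p and a positive observation follows a Gamma density weighted by p. The function names and the parameters `p_present`, `shape`, and `scale` are hypothetical, and they are held constant here, whereas in the setting described they would presumably depend on the satellite and in-situ covariates.

```python
import math


def gamma_logpdf(y, shape, scale):
    """Log density of a Gamma(shape, scale) distribution at y > 0."""
    return ((shape - 1.0) * math.log(y) - y / scale
            - math.lgamma(shape) - shape * math.log(scale))


def hurdle_gamma_loglik(y, p_present, shape, scale):
    """Log-likelihood of one observation under a generic hurdle Gamma model.

    Stage 1 (presence-absence): y > 0 with probability p_present.
    Stage 2 (abundance): given presence, y follows Gamma(shape, scale).
    All parameters are illustrative constants (an assumption of this sketch).
    """
    if y == 0:
        return math.log1p(-p_present)  # log(1 - p): the absence case
    return math.log(p_present) + gamma_logpdf(y, shape, scale)


def hurdle_gamma_mean(p_present, shape, scale):
    """Expected abundance: P(present) times the Gamma mean (shape * scale)."""
    return p_present * shape * scale
```

For example, with a presence probability of 0.25 and a Gamma mean of 2.0 * 3.0 = 6.0, `hurdle_gamma_mean(0.25, 2.0, 3.0)` gives an expected abundance of 1.5. Separating the two stages lets the many zeros inform only the presence-absence part, while the positive values alone determine the Gamma part.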