CMStatistics 2023
B0996
Title: Random forest in the spatial framework, how to deal with it?
Authors: Luca Patelli - University of Pavia (Italy) [presenting]
Michela Cameletti - Universita degli Studi di Bergamo (Italy)
Natalia Golini - University of Turin (Italy)
Rosaria Ignaccolo - University of Turin (Italy)
Abstract: When working with regression problems involving georeferenced data, it is natural to ask whether new data-driven algorithms (developed in the Machine Learning and Statistical Learning communities) can be a valid alternative to classical models such as Kriging. In this context, random forest (RF) is a widely recognized algorithm adopted in various fields because of its flexibility in modeling the response-predictor relationship, even in the presence of strong non-linearities. However, when RF is applied to spatially correlated data, some concerns arise because the internal procedures of the algorithm implicitly assume that the observations are independent. It is therefore necessary to assess whether RF can be used with spatial data and, if so, which strategies are suitable. This contribution addresses this open question by presenting and comparing some recent strategies for making the RF algorithm "spatially aware". In particular, a taxonomy will be proposed to categorize the most recent contributions on the topic, and a simulation study with geostatistical data will be carried out for the most relevant strategies.