View Submission - CFE-CMStatistics 2025
A0525
Title: The cost of ignoring space: Bias and overconfidence in model calibration
Authors: Michele Nguyen - Nanyang Technological University (Singapore) [presenting]
Maricar Rabonza - Nanyang Technological University (Singapore)
David Lallemant - Nanyang Technological University (Singapore)
Abstract: Data with spatial information are collected through active and passive methods such as field surveys and sensor networks, and are used to calibrate models in fields like environmental science, ecology, urban planning, and public health. Despite Tobler's first law of geography, which states that nearby observations tend to be more similar than distant ones, spatial dependence is sometimes ignored during modeling or omitted by using cost functions that treat observations as independent. An issue in model calibration with spatial data is highlighted: when spatial dependence is present, one may be overconfident in the estimated parameters and, in the worst cases, face systematic bias. The problem is most evident when observations are spatially clustered, and in extreme cases it is compounded by unbalanced data. It is demonstrated through simulation experiments with a simple quadratic model, in which one dataset has equidistant observations while the other has clustered observations and an isolated point. Next, spatial weights are developed based on spatial conditional information, and it is shown that weighting serves as a middle ground between explicit spatial modeling and omitting spatial dependence. This is illustrated using case studies of heavy metal soil contamination and malaria prevalence. Finally, the need for further development of computationally efficient methods to address the complex interplay between clustering, spatial dependence, and model parameters is highlighted.
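
The overconfidence described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation: the exponential covariance, the length scale of 0.3, the one-dimensional locations, and the function names are all assumptions chosen for illustration. For a quadratic model fitted by ordinary least squares, the sketch compares the parameter variances reported under an independence assumption with the variances that actually hold when the errors are spatially correlated, for an equidistant design and for a clustered design with one isolated point.

```python
import numpy as np

def design_matrix(x):
    # Quadratic model: y = b0 + b1*x + b2*x^2 + error
    return np.column_stack([np.ones_like(x), x, x**2])

def exp_cov(x, sigma2=1.0, length_scale=0.3):
    # Assumed exponential spatial covariance between 1-D locations
    d = np.abs(x[:, None] - x[None, :])
    return sigma2 * np.exp(-d / length_scale)

def ols_variances(x, sigma2=1.0, length_scale=0.3):
    X = design_matrix(x)
    XtX_inv = np.linalg.inv(X.T @ X)
    # Variance reported when observations are treated as independent
    naive = sigma2 * XtX_inv
    # Actual variance of the OLS estimator under correlated errors
    Sigma = exp_cov(x, sigma2, length_scale)
    true = XtX_inv @ X.T @ Sigma @ X @ XtX_inv
    return np.diag(naive), np.diag(true)

# Hypothetical designs: 10 equidistant points, and 9 clustered points
# plus one isolated point
equidistant = np.linspace(0.0, 1.0, 10)
clustered = np.concatenate([np.linspace(0.0, 0.1, 9), [1.0]])

for name, x in [("equidistant", equidistant), ("clustered", clustered)]:
    naive, true = ols_variances(x)
    print(name, "true/naive variance ratio:", np.round(true / naive, 2))
```

A ratio above 1 for a coefficient means the independence assumption understates its uncertainty. Because the comparison is analytic, it isolates the effect of the design and the assumed covariance from simulation noise; it does not cover the bias or the spatial weighting discussed in the abstract.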