CMStatistics 2023
B0530
Title: How to accommodate uncertain observations and data quality in species distribution modeling using point process models
Authors: Emy Guilbault - University of Helsinki (Finland) [presenting]
Ian Renner - University of Newcastle (Australia)
Eric Beh - University of Wollongong (Australia)
Michael Mahony - School of Environmental and Life Sciences - The University of Newcastle (Australia)
Abstract: Many statistical methods aim to produce species distribution models (SDMs) that better predict where species occur as a function of the environment. However, observations from opportunistic surveys raise many practical challenges in terms of data quality and sampling bias. Species identification can be unreliable, as taxonomic changes render older records ambiguous. Other than cleaning datasets with missing information, little is typically done in SDMs to account for misspecification. Additionally, observers tend to favour certain areas due to accessibility or a priori knowledge, thus collecting data that are not representative of the true species distribution. These practices can lead to missing information and thus to incomplete predictions. Two new tools are proposed to overcome these shortcomings by fitting multi-species presence-only models with partial species identification. Combining a point process model framework with mixture modelling or machine learning approaches, the tools accommodate incomplete labelling iteratively while also incorporating sampling bias correction, accounting for sample size, and addressing potential model over-fitting via lasso-type penalties. Simulation studies and an application to Australian frogs of the genus Mixophyes are used to evaluate the models' capabilities and limits. These tools offer new avenues for incorporating data of varying quality in ecology and conservation.
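Illustration: the abstract does not spell out how the iterative relabelling works, so the following is only a minimal sketch of one ingredient that a mixture-of-point-processes approach could use, not the authors' implementation. It assumes each species has a log-linear Poisson intensity and uses the standard result that a point at location s in a superposition of Poisson processes belongs to species k with probability equal to that species' intensity at s divided by the summed intensities. The function name `membership_weights`, the covariate values, and the coefficients are all hypothetical.

```python
import math


def membership_weights(covariates, coefs):
    """Return per-species membership probabilities for unlabelled points.

    covariates: list of covariate vectors (one per unlabelled point),
                each including a leading 1.0 for the intercept.
    coefs:      list of coefficient vectors (one per species) for
                assumed log-linear intensities, log(lambda_k) = x . beta_k.
    """
    weights = []
    for x in covariates:
        # Log-intensity of every species at this point.
        log_lams = [sum(b * v for b, v in zip(beta, x)) for beta in coefs]
        # Subtract the maximum before exponentiating for numerical stability.
        top = max(log_lams)
        lams = [math.exp(value - top) for value in log_lams]
        total = sum(lams)
        # Normalised intensities are the membership probabilities.
        weights.append([lam / total for lam in lams])
    return weights


# Hypothetical data: three unlabelled points, one covariate plus intercept,
# and two species with opposite responses to the covariate.
covariates = [[1.0, 0.2], [1.0, 1.5], [1.0, -0.7]]
coefs = [[0.0, 1.0], [0.0, -1.0]]
weights = membership_weights(covariates, coefs)
```

In an iterative scheme of this kind, such weights would be recomputed after each refit of the species intensities until they stabilise; sampling bias correction and lasso-type penalties would enter at the refitting step, which is not shown here.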