View Submission - EcoSta2024
A0557
Title: Two-part predictive modeling for COVID-19 deaths in the U.S.
Authors: Xiyue Liao - SDSU (United States) [presenting]
Abstract: Predicting COVID-19 outcomes has been essential to the prevention and control of the disease. This case study develops predictive models for COVID-19 deaths from a cross-sectional data set of 28,955 observations and 18 variables, compiled from five data sources on Kaggle. To model the highly skewed distribution of COVID-19 deaths, a two-part modeling framework is introduced: the first part is a binary logistic classifier, and the second part uses machine learning or statistical smoothing methods. The aim is to understand which factors are most relevant to the occurrence and fatality of COVID-19. Performance is evaluated with criteria such as root mean squared error (RMSE) and mean absolute error (MAE). The two-part XGBoost model is found to perform best in predicting the entire distribution of COVID-19 deaths. The most important factors for COVID-19 deaths include population and the rate of primary care physicians, among others.
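The two-part idea described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: it assumes a single hypothetical predictor, uses a hand-written logistic regression for part one, and substitutes a simple least-squares fit on log deaths for the machine learning or smoothing methods in part two. All function names and the toy values are invented for illustration and are not taken from the study's data.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(x, occurred, lr=0.1, epochs=5000):
    """Part 1: model P(deaths > 0 | x) by gradient descent on the log-loss."""
    a, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for xi, zi in zip(x, occurred):
            err = sigmoid(a + b * xi) - zi
            grad_a += err
            grad_b += err * xi
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def fit_line(x, y):
    """Closed-form simple least squares; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_two_part(x, y):
    # Part 1 uses every observation; part 2 uses only the positive counts.
    occurred = [1.0 if yi > 0 else 0.0 for yi in y]
    classifier = fit_logistic(x, occurred)
    positives = [(xi, math.log(yi)) for xi, yi in zip(x, y) if yi > 0]
    regressor = fit_line([p[0] for p in positives], [p[1] for p in positives])
    return classifier, regressor

def predict_two_part(model, x):
    # Combined prediction: P(deaths > 0) times the predicted size given deaths > 0.
    # Assumption: exp() of the log-scale fit is used as the size estimate,
    # with no retransformation (smearing) correction.
    (a, b), (c, d) = model
    return [sigmoid(a + b * xi) * math.exp(c + d * xi) for xi in x]

def rmse(actual, predicted):
    n = len(actual)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(actual, predicted)) / n)

def mae(actual, predicted):
    return sum(abs(p - t) for t, p in zip(actual, predicted)) / len(actual)

# Toy, made-up values: a skewed outcome with many zeros.
x_toy = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
y_toy = [0, 0, 0, 2, 0, 5, 12, 30]

two_part_model = fit_two_part(x_toy, y_toy)
toy_predictions = predict_two_part(two_part_model, x_toy)
print("RMSE:", round(rmse(y_toy, toy_predictions), 3))
print("MAE:", round(mae(y_toy, toy_predictions), 3))
```

Separating the zero/non-zero decision from the size of the positive counts lets each part be fitted on the data it describes best, which is the motivation the abstract gives for handling a highly skewed outcome. In practice the second part could be replaced by a gradient-boosted or smoothing-based regressor, and the errors would be computed on held-out data rather than on the training sample as done here.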