View Submission - EcoSta2023
A0326
Title: Supervised stratified subsampling: An approach to big data predictive analytics
Authors: Ming-Chung Chang - Academia Sinica (Taiwan) [presenting]
Abstract: Predictive analytics encompasses the use of statistical models for prediction. Its power, however, is hindered by the rapidly growing volume of data in recent years. Owing to advanced technology, big data are ubiquitous across disciplines. Such data richness may create difficulties in predictive analytics in terms of either computing time or numerical stability. A new subsampling approach is introduced to overcome these difficulties for regression problems. The proposed method, referred to as supervised stratified subsampling, integrates a nonparametric regression technique with stratified sampling. Theoretical properties are developed to justify the method. Numerical studies show that the proposed method yields good predictions and is robust against model misspecification.
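The abstract does not spell out the algorithm, so the following is only a minimal sketch of one possible reading, not the author's procedure. It assumes that a nonparametric pilot fit (here a Nadaraya-Watson kernel smoother on a small uniform pilot sample) supplies fitted responses, that strata are quantile bins of those fitted values, and that each stratum receives an equal share of the subsample. The function names, the one-dimensional covariate, the bandwidth, and the allocation rule are all illustrative assumptions.

```python
import numpy as np


def kernel_smooth(x_train, y_train, x_query, bandwidth):
    """Nadaraya-Watson estimate with a Gaussian kernel (1-D covariate)."""
    diffs = (x_query[:, None] - x_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs**2)
    return weights @ y_train / weights.sum(axis=1)


def stratified_subsample(x, y, n_sub, n_strata=10, n_pilot=200,
                         bandwidth=0.1, seed=0):
    """Hypothetical helper: stratify on pilot fitted values, then sample.

    Assumption (not from the abstract): strata are quantile bins of the
    pilot fit, and every stratum gets the same number of draws.
    """
    rng = np.random.default_rng(seed)
    n = len(x)

    # Step 1: nonparametric pilot fit on a small uniform sample.
    pilot = rng.choice(n, size=n_pilot, replace=False)
    fitted = kernel_smooth(x[pilot], y[pilot], x, bandwidth)

    # Step 2: strata are quantile bins of the fitted responses.
    inner_probs = np.linspace(0, 1, n_strata + 1)[1:-1]
    edges = np.quantile(fitted, inner_probs)
    strata = np.searchsorted(edges, fitted, side="right")

    # Step 3: equal allocation, sampling without replacement per stratum.
    per_stratum = n_sub // n_strata
    chosen = []
    for s in range(n_strata):
        members = np.flatnonzero(strata == s)
        take = min(per_stratum, len(members))
        chosen.append(rng.choice(members, size=take, replace=False))
    return np.concatenate(chosen), strata


# Synthetic data for illustration only.
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 20_000)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, x.size)
idx, strata = stratified_subsample(x, y, n_sub=500)
```

A downstream regression model would then be fitted on `x[idx]` and `y[idx]` only. Stratifying on fitted responses rather than on the raw covariate is what makes this sketch "supervised"; whether the proposed method defines its strata or allocation this way is not stated in the abstract.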