View Submission - EcoSta2024
A0247
Title: Post-estimation strategies in high-dimensional data analytics
Authors: Ejaz Ahmed - Brock (Canada) [presenting]
Abstract: The rapid growth in the size and scope of data sets across a host of disciplines has created a need for innovative statistical strategies to understand such data. A variety of statistical and machine learning methods is needed to reveal the hidden data story. Complex big data analysis is a challenging but rewarding research area, as data sets feature large numbers of predictors, data contamination, unstructured patterns, and so on. Many models are now data-driven with a large number of predictors, that is, they involve high-dimensional data (HDD). For HDD analysis, many penalized methods have been introduced for simultaneous variable selection and parameter estimation when the model is sparse. However, a model may have sparse signals as well as a number of predictors with weak signals. In this scenario, variable selection methods may fail to distinguish predictors with weak signals from those with sparse signals. For this reason, a high-dimensional shrinkage estimation (HDSE) strategy is proposed to improve the prediction performance of a submodel. The proposed HDSE strategy is demonstrated to perform uniformly better than the penalized and machine learning methods in many cases. Its relative performance is appraised through both simulation studies and real data analysis. Some open research problems are discussed as well.