View Submission - EcoSta2024
A0777
Title: Subsample size determination with different approaches
Authors: Sheng Zhang - Indiana University Purdue University at Indianapolis (United States) [presenting]
Abstract: Motivated by subsampling in the analysis of big data and by data splitting in machine learning, sample size determination for multidimensional parameters is studied with the traditional normal approximation approach. A novel approach to constructing confidence intervals is also proposed, based on concentration inequalities with the missing factors; applied in reverse, it can be used to determine the subsample size for big data analysis. Improved concentration inequalities are derived by providing the missing factors, and the results are applied to estimate the tail probability of certain random sums. A formula for the confidence interval is provided, and simulation results are reported.
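To illustrate the general idea of reading a confidence statement in reverse to get a sample size, here is a minimal sketch. It is not the method of the talk: it uses the textbook normal approximation and the classical Hoeffding inequality (without the improved "missing factors" the abstract refers to), and handles d coordinates with a simple Bonferroni adjustment. The function names and the parameters `eps`, `alpha`, `sigma`, `d`, and `width` are assumptions introduced for this example.

```python
import math
from statistics import NormalDist


def n_normal(eps, alpha, sigma, d=1):
    """Subsample size from the normal approximation.

    Requires each of d coordinate means to lie within eps of its target
    with joint confidence at least 1 - alpha (Bonferroni adjustment).
    sigma is an assumed upper bound on each coordinate's standard deviation.
    """
    z = NormalDist().inv_cdf(1 - alpha / (2 * d))
    return math.ceil((z * sigma / eps) ** 2)


def n_hoeffding(eps, alpha, d=1, width=1.0):
    """Subsample size from the classical Hoeffding inequality.

    Assumes each coordinate takes values in an interval of length width.
    Solves 2 * d * exp(-2 * n * eps**2 / width**2) <= alpha for n.
    """
    return math.ceil(width ** 2 * math.log(2 * d / alpha) / (2 * eps ** 2))


if __name__ == "__main__":
    # Margin 0.01, 95% confidence, values in [0, 1] (so sigma <= 0.5).
    for d in (1, 5):
        print(d, n_normal(0.01, 0.05, 0.5, d), n_hoeffding(0.01, 0.05, d))
```

In this sketch the Hoeffding-based size is larger than the normal-approximation size for the same margin and confidence level, which is the usual price of a non-asymptotic guarantee; tightening such bounds is the kind of gain that improved concentration inequalities aim for.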