B1570
Title: Quantifying uncertainty of subsampling-based ensemble methods under a U-statistic framework
Authors: Qing Wang - Wellesley College (United States) [presenting]
Yujie Wei - Johns Hopkins University (United States)
Abstract: The problem of variance estimation for subsampling-based ensemble methods, such as subbagging and sub-random forest, is addressed. We first recognize that a subsampling-based ensemble can be written in the form of a U-statistic of degree k, where k is the subsample size. As a result, one can study the uncertainty of the ensemble estimator under a U-statistic framework. Motivated by previous work, we propose an unbiased variance estimator for a subsampling-based ensemble, which can be computed efficiently with the help of a partition-resampling scheme. Simulation studies show that the proposed variance estimator has a significant computational advantage over the benchmark and performs better in terms of mean, standard deviation, and mean squared error under both a simple linear regression model and a multivariate adaptive regression splines (MARS) model. Furthermore, we show how to construct an asymptotic confidence interval for the expected response of an ensemble using the proposed variance estimator, and compare its coverage probability with that of competing methods. Finally, we demonstrate the practical application of the methodology using real data examples.
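To make the U-statistic representation concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the base learner is a simple linear regression fitted on a subsample of size k and evaluated at a query point x0; this prediction plays the role of a symmetric kernel of degree k. Averaging the kernel over all size-k subsamples gives a complete U-statistic, and averaging over randomly drawn subsamples (subbagging) gives an incomplete one. All function names are hypothetical, and the proposed variance estimator and the partition-resampling scheme are not reproduced here.

```python
import itertools
import random
from statistics import fmean


def ols_predict(sample, x0):
    """Hypothetical kernel h: fit simple linear regression on a subsample, predict at x0."""
    xs = [x for x, _ in sample]
    ys = [y for _, y in sample]
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate subsample (all x equal): fall back to the mean response.
        return y_bar
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar + slope * (x0 - x_bar)


def complete_u_statistic(data, k, x0):
    """Average the kernel over all C(n, k) subsamples of size k."""
    return fmean([ols_predict(s, x0) for s in itertools.combinations(data, k)])


def subbagging(data, k, x0, n_subsamples, rng):
    """Incomplete U-statistic: average the kernel over random size-k subsamples
    drawn without replacement from the data."""
    return fmean([ols_predict(rng.sample(data, k), x0) for _ in range(n_subsamples)])


if __name__ == "__main__":
    rng = random.Random(1)
    # Simulated data from y = 2x + 1 plus Gaussian noise (illustrative only).
    data = [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.5)) for x in range(12)]
    print(complete_u_statistic(data, k=4, x0=5.5))
    print(subbagging(data, k=4, x0=5.5, n_subsamples=200, rng=rng))
```

With a moderate number of random subsamples the subbagging estimate should typically be close to the complete U-statistic, which is why the ensemble's uncertainty can be studied through U-statistic theory.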