CMStatistics 2023
B1056
Title: Spectral regularized kernel two-sample test
Authors: Bharath Sriperumbudur - Pennsylvania State University (United States) [presenting]
Omar Hagrass - Pennsylvania State University (United States)
Bing Li - The Pennsylvania State University (United States)
Abstract: Over the last decade, a popular approach to non-parametric testing problems on general (i.e., non-Euclidean) domains has been based on the reproducing kernel Hilbert space (RKHS) embedding of probability distributions. The main goal of this work is to understand the optimality of two-sample tests constructed from this approach. First, it is shown that the popular maximum mean discrepancy (MMD) two-sample test is not optimal in terms of the separation boundary measured in Hellinger distance. Second, a modification of the MMD test is proposed, based on spectral regularization that takes into account covariance information (which the MMD test does not capture), and the proposed test is proven to be minimax optimal with a smaller separation boundary than that achieved by the MMD test. Third, an adaptive version of this test is proposed, which uses a data-driven strategy to choose the regularization parameter, and the adaptive test is shown to be minimax optimal up to a logarithmic factor. Moreover, the results hold for the permutation variant of the test, in which the test threshold is chosen through permutation of the samples. Numerical experiments on synthetic and real-world data demonstrate the superior performance of the proposed test in comparison to the MMD test.
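For orientation, the baseline that the abstract compares against can be sketched in a few lines. The following is a minimal illustration of a standard MMD two-sample test with a permutation threshold, not the spectral regularized test proposed by the authors. The Gaussian kernel, the fixed bandwidth, the number of permutations, and all function names (`gaussian_kernel`, `mmd2_unbiased`, `mmd_permutation_test`) are assumptions made for this sketch and do not come from the abstract.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b (assumed kernel choice)."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_unbiased(k, n):
    """Unbiased estimate of squared MMD from the pooled kernel matrix k.

    The first n rows/columns of k belong to the first sample, the rest to the second.
    """
    m = k.shape[0] - n
    kxx, kyy, kxy = k[:n, :n], k[n:, n:], k[:n, n:]
    term_x = (kxx.sum() - np.trace(kxx)) / (n * (n - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (m * (m - 1))
    return term_x + term_y - 2.0 * kxy.mean()


def mmd_permutation_test(x, y, bandwidth=1.0, num_permutations=200, seed=0):
    """Return the observed statistic and a permutation p-value."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n = x.shape[0]
    # Compute the pooled kernel matrix once and permute its indices afterwards.
    k = gaussian_kernel(pooled, pooled, bandwidth)
    observed = mmd2_unbiased(k, n)
    exceed = 0
    for _ in range(num_permutations):
        perm = rng.permutation(pooled.shape[0])
        permuted_stat = mmd2_unbiased(k[np.ix_(perm, perm)], n)
        exceed += int(permuted_stat >= observed)
    # The +1 correction keeps the p-value strictly positive and the test valid.
    p_value = (1 + exceed) / (1 + num_permutations)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, size=(50, 2))
    y = rng.normal(3.0, 1.0, size=(50, 2))  # clearly shifted second sample
    stat, p = mmd_permutation_test(x, y)
    print(f"MMD^2 estimate: {stat:.4f}, permutation p-value: {p:.4f}")
```

Rejecting when the p-value falls below the chosen level gives the permutation variant of the baseline test. The proposed test differs in that it regularizes the statistic using covariance information and selects the regularization parameter from the data, neither of which is attempted here.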