COMPSTAT 2023: Start Registration
View Submission - COMPSTAT2023
A0324
Title: Efficient nonparametric two-sample testing with the maximum mean discrepancy Authors:  Dean Bodenham - Imperial College London (United Kingdom) [presenting]
Yoshinobu Kawahara - Osaka University / RIKEN (Japan)
Abstract: Many good nonparametric two-sample tests for univariate data exist, such as the Kolmogorov-Smirnov, Cramer-von Mises and Wilcoxon-Mann-Whitney tests. The maximum mean discrepancy (MMD) test is a nonparametric kernelised two-sample test that, when using a characteristic kernel, can detect any distributional change between two samples. It is defined for multivariate data, and when the total number of $d$-dimensional observations is $n$, direct computation of the test statistic is $O(dn^2)$. While approximations with lower computational complexity are known, more efficient methods for computing the exact test statistic are unknown. First, an exact method is described for computing the MMD test statistic for the univariate case in $O(n \log n)$ using the Laplacian kernel, and will compare its performance to classic univariate tests, highlighting cases where the MMD is more powerful. Moreover, this approach can be modified to create a $O(n \log n)$ algorithm for exactly computing the MMD statistic for bivariate data. For higher dimensions, the exact univariate method is extended to an approximate method, also with complexity log-linear in the number of observations. Experiments show that this approximate method can have good statistical performance when compared to the exact test, particularly in cases where $d > n$.