Title: Cross-validation Bayes factors for the nonparametric two-sample test
Authors: Jeff Hart - Texas A&M University (United States) [presenting]
Naveed Merchant - Texas A&M University (United States)
Taeryon Choi - Korea University (Korea, South)
Abstract: Given independent random samples from densities $f$ and $g$, a fundamental problem is to test whether $f$ and $g$ are equal. We define Bayes factors that use data splitting to test this hypothesis. Two models are considered: $M_1$, which assumes the densities are the same, and $M_2$, which allows $f$ and $g$ to differ. Each data set is split into two parts, training and validation. Three kernel density estimates (KDEs) are computed from the training data, and the models $M_1$ and $M_2$ are defined in terms of these KDEs. A marginal likelihood for each model is then computed from the validation data, and the Bayes factor is the ratio of the two marginal likelihoods. The relative simplicity of this method in comparison to existing nonparametric Bayes procedures is emphasized. The proposed method involves only three parameters, namely the bandwidths of the three KDEs. Appropriate priors for the bandwidths are proposed, and the importance of choosing a good kernel for the KDEs is discussed. In particular, relatively heavy-tailed kernels should be used to guarantee good performance of the Bayes factors in a variety of settings.
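The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the three KDEs are taken to be one built from the pooled training data (for $M_1$) and one from each training sample (for $M_2$); the kernel is a Cauchy density, chosen as one example of a heavy-tailed kernel; the bandwidth prior is a placeholder that puts equal weight on a log-spaced grid scaled by the pooled training standard deviation; the two bandwidths under $M_2$ are given independent priors, so that the marginal likelihood factorizes; and the Bayes factor is reported as $M_2$ over $M_1$ on the log scale. The function names `kde_loglik`, `log_marginal`, and `cv_log_bayes_factor` are hypothetical.

```python
import math
import random


def kde_loglik(train, valid, h):
    """Sum of log KDE values at the validation points (Cauchy kernel, bandwidth h)."""
    total = 0.0
    for v in valid:
        dens = sum(1.0 / (math.pi * h * (1.0 + ((v - t) / h) ** 2)) for t in train)
        total += math.log(dens / len(train))
    return total


def log_marginal(train, valid, h_grid):
    """Log marginal likelihood under a placeholder prior: equal weight on each grid bandwidth."""
    logs = [kde_loglik(train, valid, h) - math.log(len(h_grid)) for h in h_grid]
    m = max(logs)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(a - m) for a in logs))


def cv_log_bayes_factor(x, y, train_frac=0.5, n_grid=20, seed=0):
    """Log Bayes factor of M2 (f and g may differ) over M1 (f equals g)."""
    rng = random.Random(seed)
    x, y = list(x), list(y)
    rng.shuffle(x)
    rng.shuffle(y)
    kx, ky = int(train_frac * len(x)), int(train_frac * len(y))
    x_tr, x_va = x[:kx], x[kx:]
    y_tr, y_va = y[:ky], y[ky:]
    pooled_tr = x_tr + y_tr

    # Assumed bandwidth grid: 0.05 to 5 times the pooled training standard deviation.
    mean = sum(pooled_tr) / len(pooled_tr)
    scale = math.sqrt(sum((t - mean) ** 2 for t in pooled_tr) / len(pooled_tr))
    h_grid = [scale * 0.05 * 100.0 ** (k / (n_grid - 1)) for k in range(n_grid)]

    # M1: one KDE from the pooled training data, evaluated on all validation data.
    log_m1 = log_marginal(pooled_tr, x_va + y_va, h_grid)
    # M2: separate KDEs; independent bandwidth priors make the marginal a product.
    log_m2 = log_marginal(x_tr, x_va, h_grid) + log_marginal(y_tr, y_va, h_grid)
    return log_m2 - log_m1


if __name__ == "__main__":
    rng = random.Random(42)
    sample_f = [rng.gauss(0.0, 1.0) for _ in range(100)]
    sample_g = [rng.gauss(3.0, 1.0) for _ in range(100)]
    print("log Bayes factor (M2 vs M1):", cv_log_bayes_factor(sample_f, sample_g))
```

Under this convention, positive values favor $M_2$ and negative values favor $M_1$. The result depends on the particular random split, so in practice one might examine several splits; the grid prior here is only a stand-in for the bandwidth priors proposed in the talk.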