Title: Asymptotically distribution-free change-point detection for multivariate and non-Euclidean data
Authors: Lynna Chu - Iowa State University (United States) [presenting]
Hao Chen - University of California at Davis (United States)
Abstract: The focus is on testing for and estimating change-points, the locations where the distribution abruptly changes, in a sequence of multivariate or non-Euclidean observations. While the change-point problem has been extensively studied for low-dimensional data, advances in data collection technology have produced data sequences of increasing volume and complexity. Motivated by the challenges of modern data, we study a non-parametric framework that can be applied effectively to various data types, as long as an informative similarity measure can be defined on the sample space. The existing approach along this line has low power and/or gives biased change-point estimates under some common scenarios. To address these problems, we present new tests based on similarity information that exhibit substantial improvements in detecting and estimating change-points. In addition, under some mild conditions, the new test statistics are asymptotically distribution-free under the null hypothesis of no change. Analytic formulas approximating the p-values of the new test statistics are derived, making the new approaches easy to use as off-the-shelf tools for large datasets. The effectiveness of the new approaches is illustrated in an analysis of New York taxi data.
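To make the idea of a test "based on similarity information" concrete, the following is a minimal sketch of a basic edge-count scan on a k-nearest-neighbor similarity graph. It is an illustration under stated assumptions, not the new statistics presented in the talk: the function names, the choice of a k-NN graph with Euclidean distance, and the parameters `k`, `n_perm`, and `margin` are all hypothetical, and the p-value comes from a permutation procedure rather than from the analytic approximation formulas the abstract refers to.

```python
import numpy as np

def knn_edges(X, k=3):
    """Undirected k-nearest-neighbor edges as an (m, 2) array with i < j.
    Euclidean distance is an assumption; any similarity measure could be used."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
    nbrs = np.argsort(d, axis=1)[:, :k]
    edges = {(min(i, int(j)), max(i, int(j))) for i in range(len(X)) for j in nbrs[i]}
    return np.array(sorted(edges))

def between_counts(edges, order, n):
    """R(t) for t = 1..n-1: number of edges joining an observation among the
    first t time positions to one among the remaining n - t positions."""
    a, b = order[edges[:, 0]], order[edges[:, 1]]
    lo, hi = np.minimum(a, b), np.maximum(a, b)
    # An edge crosses split t exactly when lo < t <= hi (0-based positions).
    diff = np.zeros(n + 1, dtype=int)
    np.add.at(diff, lo + 1, 1)
    np.add.at(diff, hi + 1, -1)
    return np.cumsum(diff)[1:n]

def scan_change_point(X, k=3, n_perm=200, margin=5, seed=0):
    """Return (estimated change-point, permutation p-value).
    Few between-segment edges at t suggest a distributional change at t."""
    rng = np.random.default_rng(seed)
    n = len(X)
    edges = knn_edges(X, k)
    obs = between_counts(edges, np.arange(n), n)
    # Null reference: the same graph with randomly permuted time order.
    perm = np.array([between_counts(edges, rng.permutation(n), n)
                     for _ in range(n_perm)])
    mu, sd = perm.mean(axis=0), perm.std(axis=0)
    sl = slice(margin - 1, n - margin)  # candidate t in margin..n-margin
    z_obs = -(obs - mu)[sl] / sd[sl]
    z_perm = -(perm - mu)[:, sl] / sd[sl]
    t_hat = margin + int(np.argmax(z_obs))
    p_value = (1 + np.sum(z_perm.max(axis=1) >= z_obs.max())) / (1 + n_perm)
    return t_hat, p_value

if __name__ == "__main__":
    # Synthetic example: a mean shift after the 50th of 100 observations.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0.0, 1.0, (50, 5)), rng.normal(3.0, 1.0, (50, 5))])
    print(scan_change_point(X))
```

Because the scan only uses the graph, swapping in a different similarity measure requires changing only `knn_edges`. The permutation step costs `n_perm` passes over the edge list; analytic p-value approximations of the kind the abstract describes would remove that cost on large datasets.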