View Submission - COMPSTAT2024
A0268
Title: From small scales to large scales: Distance-to-measure density based geometric analysis of complex data
Authors: Katharina Proksch - University of Twente (Netherlands) [presenting]
Christoph Weitkamp - University of Goettingen (Germany)
Thomas Staudt - University of Goettingen (Germany)
Christophe Zimmer - Institut Pasteur (France)
Benoit Lelandais - Institut Pasteur (France)
Abstract: The analysis and classification of complex point clouds are considered. We focus on the task of identifying differences between noisy point clouds based on small-scale characteristics while disregarding large-scale information. We propose an approach based on transforming the data via the so-called Distance-to-Measure (DTM) function, which is based on the average of nearest neighbour distances. For each data set, we estimate the probability density of the average local distances of all data points and use the estimated densities for classification. While the approach is immediately applicable and performs very well in practice, the theoretical study of the density estimators is challenging: although the original observations are i.i.d., the estimators are computed from data obtained via a complicated transformation. In fact, the transformed data are stochastically dependent in a non-local way that is not captured by commonly considered dependence measures. Nonetheless, using theoretical properties of U-statistics, which allow us to handle the dependencies, we show that the asymptotic behaviour of the density estimator is driven by a kernel density estimator of certain i.i.d. random variables. We show in a numerical study and in an application to simulated single molecule localization microscopy data of chromatin fibers that unsupervised classification based on estimated DTM densities achieves excellent separation.
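The pipeline described in the abstract (DTM transform, density estimate of the transformed values, comparison of densities) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the empirical DTM is taken as the root mean squared distance to the k nearest neighbours, a common convention in the DTM literature; the point itself is excluded from its neighbours; the kernel is Gaussian with a hand-picked bandwidth; and densities are compared by an L1 distance. The function names `dtm_values`, `gaussian_kde` and `l1_distance`, the parameter values, and the toy point clouds are all hypothetical.

```python
import math
import random


def dtm_values(points, k):
    """Empirical DTM at each sample point (assumed form): root mean squared
    distance to its k nearest neighbours, the point itself excluded."""
    values = []
    for i, p in enumerate(points):
        # Brute-force squared distances to all other points, smallest first.
        sq = sorted(math.dist(p, q) ** 2 for j, q in enumerate(points) if j != i)
        values.append(math.sqrt(sum(sq[:k]) / k))
    return values


def gaussian_kde(samples, bandwidth):
    """Return a function t -> Gaussian kernel density estimate from samples."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(t):
        return sum(math.exp(-0.5 * ((t - s) / bandwidth) ** 2) for s in samples) / norm

    return density


def l1_distance(f, g, grid):
    """Riemann-sum approximation of the L1 distance between f and g
    on a uniformly spaced grid."""
    step = grid[1] - grid[0]
    return sum(abs(f(t) - g(t)) for t in grid) * step


# Toy data (hypothetical): a diffuse cloud and a tightly clustered cloud.
random.seed(0)
cloud_a = [(random.random(), random.random()) for _ in range(80)]
cloud_b = [(random.gauss(0.5, 0.05), random.gauss(0.5, 0.05)) for _ in range(80)]

# One DTM density per point cloud; these densities are the features
# that a classification or clustering step would compare.
dens_a = gaussian_kde(dtm_values(cloud_a, k=5), bandwidth=0.01)
dens_b = gaussian_kde(dtm_values(cloud_b, k=5), bandwidth=0.01)

grid = [i * 0.001 for i in range(1001)]
print(l1_distance(dens_a, dens_b, grid))
```

In an unsupervised setting, pairwise distances of this kind between the estimated DTM densities of many point clouds could be passed to a standard clustering method. The brute-force neighbour search is quadratic in the number of points, so a spatial index would be the natural replacement for larger data sets.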