CMStatistics 2021
B0352
Title: Hypothesis testing and learning on network-valued data
Authors: Debarghya Ghoshdastidar - Technical University of Munich (Germany) [presenting]
Abstract: Network analysis has evolved over the past two decades. Traditionally, a network is viewed as a tool for modelling interactions among entities of interest; for instance, an analysis of the Facebook network may focus on finding communities of users. Recent applications in bioinformatics and other areas require a different perspective, in which the networks themselves are the quantities of interest. Examples include classifying protein structures as enzymes or non-enzymes, and detecting whether the brain networks of patients with a neurological disease are statistically different from those of healthy individuals. We refer to such problems as learning from network-valued data, to distinguish them from traditional network analysis problems, which involve a single network of interactions. There has been considerable research on supervised learning on network-valued data, with the two most powerful tools being graph kernels and graph neural networks. We focus on two problems beyond the supervised setting: hypothesis testing of large graphs, and clustering of network-valued data. A key challenge in such problems is the scarcity of data: typically, one has access to only a few large graphs, and popular approaches are not known to work well in this regime. We will discuss approaches for network testing and clustering based on ideas from high-dimensional statistics, random graphs and graphons. We will also discuss some theoretical properties of these methods (statistical consistency and minimax rates) and demonstrate their empirical performance.