Title: Manifold valued data analysis of samples of networks, with applications in corpus linguistics
Authors: Katie Severn - University of Nottingham (United Kingdom) [presenting]
Ian Dryden - University of Nottingham (United Kingdom)
Simon Preston - University of Nottingham (United Kingdom)
Abstract: Networks can be used to represent many systems, such as text documents and brain activity, and it is of interest to develop statistical techniques to compare networks. A general framework is developed for extrinsic statistical analysis of samples of networks, motivated by networks representing text documents in corpus linguistics. Networks are identified by their graph Laplacian matrices, for which metrics, embeddings, tangent spaces, and a projection from Euclidean space to the space of graph Laplacians are defined. This framework provides a way of computing means, performing principal component analysis and regression, and carrying out hypothesis tests, such as tests for equality of means between two samples of networks. The methodology is applied to the set of novels by Jane Austen and Charles Dickens.
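As an illustration of identifying networks with their graph Laplacians and computing a sample mean, the following is a minimal sketch only, not the authors' implementation. It assumes undirected networks given as symmetric, non-negative weighted adjacency matrices on a common node set, and it assumes the plain Euclidean metric, which is just one possible choice. Under that assumption the element-wise mean of graph Laplacians is itself a graph Laplacian, so no projection step is needed; other metrics and embeddings may require the projection mentioned in the abstract. The function names and the toy matrices are hypothetical.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Return L = D - A for a symmetric, non-negative weighted adjacency matrix."""
    A = np.asarray(adjacency, dtype=float)
    if A.ndim != 2 or A.shape[0] != A.shape[1] or not np.allclose(A, A.T):
        raise ValueError("adjacency must be a square, symmetric matrix")
    A = A - np.diag(np.diag(A))          # drop self-loops (assumption of this sketch)
    return np.diag(A.sum(axis=1)) - A    # degree matrix minus adjacency


def euclidean_mean_laplacian(adjacencies):
    """Element-wise (Euclidean) mean of the Laplacians of a sample of networks."""
    return np.mean([graph_laplacian(A) for A in adjacencies], axis=0)


# Two toy networks on the same three nodes (made-up edge weights).
A1 = np.array([[0, 2, 0],
               [2, 0, 1],
               [0, 1, 0]])
A2 = np.array([[0, 0, 4],
               [0, 0, 1],
               [4, 1, 0]])

mean_L = euclidean_mean_laplacian([A1, A2])
print(mean_L)
```

In a text application, the nodes might be words and the edge weights co-occurrence counts, but that mapping is an assumption here and is not specified in the abstract.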