A0355
Title: Statistical inference for subgraph densities under induced random sampling from network data
Authors: Nilanjan Chakraborty - Missouri University of Science and Technology (United States) [presenting]
Ayoushman Bhattacharya - Washington University in Saint Louis (United States)
Soumen Lahiri - Washington University in Saint Louis (United States)
Abstract: Statistical inference for large networks based on smaller sampled networks is an important problem in network analysis. The focus is on developing a framework for obtaining statistical guarantees for subgraph densities of a general population network under simple random sampling without replacement (SRSWOR). Examples of such subgraph densities include edge density, triangle density, two-star density, and other widely studied graph summary statistics. Under this sampling scheme, the Horvitz-Thompson (HT) estimator is shown to be unbiased for the population subgraph densities, and a Berry-Esseen bound is derived that establishes its asymptotic normality. To facilitate inferential procedures, a jackknife estimator of the unknown variance of the HT estimator is provided, and its consistency is established. The joint asymptotic normality of two subgraph densities is also established, which is crucial for proving the asymptotic normality of the global clustering coefficient (global transitivity) of the sampled graph. The results are applied to the problem of testing the equality of two population graphs, using subgraph densities as test statistics. Finally, a simulation study is presented that corroborates the theoretical findings.
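The unbiasedness claim can be illustrated on a toy example. The sketch below is not the authors' implementation. It assumes that a subgraph on k nodes is observed exactly when all k of its nodes fall in the SRSWOR node sample, so that its inclusion probability is C(N-k, n-k)/C(N, n). The six-node graph and the names `count_edges`, `count_triangles`, `inclusion_prob`, and `ht_density` are hypothetical and chosen only for illustration. Enumerating every possible sample of size n = 4 and averaging the HT estimates gives the exact expectation, which can be compared with the population density.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# Hypothetical toy population graph on N = 6 nodes:
# two triangles (0,1,2) and (3,4,5) joined by the edge (2,3).
N = 6
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
EDGE_SET = {frozenset(e) for e in EDGES}


def count_edges(nodes):
    """Number of edges in the subgraph induced by `nodes`."""
    return sum(
        1 for pair in combinations(sorted(nodes), 2)
        if frozenset(pair) in EDGE_SET
    )


def count_triangles(nodes):
    """Number of triangles in the subgraph induced by `nodes`."""
    return sum(
        1 for a, b, c in combinations(sorted(nodes), 3)
        if {frozenset((a, b)), frozenset((a, c)), frozenset((b, c))} <= EDGE_SET
    )


def inclusion_prob(N, n, k):
    """Probability that k fixed nodes all appear in an SRSWOR sample of size n."""
    return Fraction(comb(N - k, n - k), comb(N, n))


def ht_density(sample_count, N, n, k):
    """HT estimate of a k-node subgraph density from a sampled induced subgraph.

    The sample count is inflated by the inverse inclusion probability to
    estimate the population count, then normalized by C(N, k).
    """
    ht_total = sample_count / inclusion_prob(N, n, k)
    return ht_total / comb(N, k)


# Population densities computed from the full graph.
pop_edge = Fraction(count_edges(range(N)), comb(N, 2))
pop_tri = Fraction(count_triangles(range(N)), comb(N, 3))

# Exact expectation of the HT estimator: average over all samples of size n.
n = 4
samples = list(combinations(range(N), n))
mean_edge = sum(ht_density(count_edges(s), N, n, 2) for s in samples) / len(samples)
mean_tri = sum(ht_density(count_triangles(s), N, n, 3) for s in samples) / len(samples)

print("edge density:", pop_edge, "mean HT estimate:", mean_edge)
print("triangle density:", pop_tri, "mean HT estimate:", mean_tri)
```

Exact rational arithmetic (`Fraction`) is used so that the comparison between the averaged estimates and the population densities is not blurred by floating-point rounding. Under these assumptions the HT estimate reduces to the corresponding density of the sampled induced subgraph, for example the sample edge count divided by C(n, 2).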