View Submission - EcoSta2023
A0880
Title: High dimensional logistic regression under network dependence
Authors: Somabha Mukherjee - National University of Singapore (Singapore) [presenting]
Ziang Niu - University of Pennsylvania (United States)
Bhaswar Bhattacharya - University of Pennsylvania (United States)
George Michailidis - University of California, Los Angeles (United States)
Sagnik Halder - University of Florida (United States)
Abstract: The classical formulation of logistic regression relies on the independent sampling assumption, which is often violated when the outcomes interact through an underlying network structure, such as over a temporal/spatial domain or on a social network. This necessitates the development of models that can simultaneously handle both the network peer effect and the effect of high-dimensional covariates. A framework is described for incorporating such dependencies in a high-dimensional logistic regression model by introducing a quadratic interaction term designed to capture the pairwise interactions from the underlying network. The resulting model can also be viewed as an Ising model, where the node-dependent external fields linearly encode the high-dimensional covariates. A penalized maximum pseudo-likelihood method is used for estimating the network peer effect and the effect of the covariates (the regression coefficients), which conveniently avoids the computational intractability of the maximum likelihood approach. The results imply that even under network dependence, it is possible to consistently estimate the model parameters at the same rate as in classical logistic regression when the true parameter is sparse and the underlying network is not too dense. The rates of consistency of the proposed estimator are also presented for various natural graph ensembles, such as bounded degree graphs, sparse Erdős-Rényi random graphs, and stochastic block models.
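To make the penalized pseudo-likelihood idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes outcomes x_i in {-1, +1}, a symmetric interaction matrix A with zero diagonal, and a local field m_i = beta * (A x)_i + z_i^T theta, so that each conditional probability is exp(x_i m_i) / (2 cosh m_i). The parameterization, the 1/n scaling, the l1 penalty on theta, and all function and variable names are illustrative assumptions.

```python
import numpy as np

def neg_log_pseudolikelihood(beta, theta, x, A, Z):
    """Average negative log pseudo-likelihood of an Ising model with covariates.

    Assumed (hypothetical) parameterization:
      x     : (n,) array of outcomes in {-1, +1}
      A     : (n, n) symmetric interaction matrix with zero diagonal
      Z     : (n, d) covariate matrix
      beta  : scalar network peer effect
      theta : (d,) regression coefficients
    """
    # Local field at each node: peer term plus linear covariate term.
    m = beta * (A @ x) + Z @ theta
    # log P(x_i | x_{-i}) = x_i * m_i - log(2 cosh m_i);
    # log(2 cosh m) is computed stably as logaddexp(m, -m).
    return -np.mean(x * m - np.logaddexp(m, -m))

def penalized_objective(beta, theta, x, A, Z, lam):
    """Pseudo-likelihood loss plus an l1 penalty on theta (assumed penalty form)."""
    return neg_log_pseudolikelihood(beta, theta, x, A, Z) + lam * np.sum(np.abs(theta))

# Small usage example on a 3-node path graph with two covariates.
x = np.array([1.0, -1.0, 1.0])
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
Z = np.array([[0.5, -1.0],
              [1.5, 0.2],
              [-0.3, 0.8]])
value = penalized_objective(0.1, np.array([0.2, 0.0]), x, A, Z, lam=0.05)
```

Each term in the loss only involves one node's conditional distribution given its neighbours, so the objective never requires the normalizing constant of the full Ising model. A sparse estimate could then be obtained by minimizing this objective, for example with a proximal gradient method.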