A1350
Title: Statistical inference for privatized data with unknown sample size
Authors: Andres Barrientos - Florida State University (United States) [presenting]
Jordan Awan - Purdue University (United States)
Nianqiao Ju - Purdue University (United States)
Abstract: Theory and algorithms are developed to analyze privatized data in unbounded differential privacy (DP), where even the sample size is considered a sensitive quantity that requires privacy protection. It is shown that the distance between the sampling distributions under unbounded DP and bounded DP goes to zero as the sample size $n$ goes to infinity, provided that the noise used to privatize $n$ scales at an appropriate rate; ABC-type posterior distributions are also shown to converge under similar assumptions. Asymptotic results are further given in the regime where the privacy budget for $n$ goes to zero, establishing the similarity of the sampling distributions and showing that the MLE in the unbounded setting converges to the bounded-DP MLE. To facilitate valid, finite-sample Bayesian inference on privatized data in the unbounded DP setting, a reversible jump MCMC algorithm is proposed, which extends the data augmentation MCMC of another study. A Monte Carlo EM algorithm is also proposed to compute the MLE from privatized data in both bounded and unbounded DP. The focus is on describing these results and discussing how they can be used to select an appropriate differentially private framework for validation servers. A validation server is a secure system that allows users to query data while ensuring the privacy of the data subjects by returning only results that adhere to specified privacy constraints.
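To make the unbounded DP setting concrete, the following is a minimal sketch of one way a sample size and a summary statistic could both be privatized. It is an illustration only, not the mechanism used by the authors: the function name `privatize_count_and_sum`, the choice of the Laplace mechanism, the clamping of records to [0, 1], and the budget split between `eps_n` and `eps_s` are all assumptions made for this example. Under add/remove-one-record neighbors, both the count and the clamped sum change by at most 1, so Laplace noise with scale 1/epsilon is used for each.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def privatize_count_and_sum(x, eps_n, eps_s, rng):
    """Hypothetical helper: release a noisy sample size and a noisy sum.

    Assumes add/remove-one neighbors (unbounded DP) and records clamped
    to [0, 1], so each query has sensitivity 1. By basic composition the
    release spends a total budget of eps_n + eps_s.
    """
    clamped = [min(max(float(v), 0.0), 1.0) for v in x]
    noisy_n = len(clamped) + laplace_noise(1.0 / eps_n, rng)
    noisy_sum = sum(clamped) + laplace_noise(1.0 / eps_s, rng)
    return noisy_n, noisy_sum


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.random() for _ in range(1000)]
    noisy_n, noisy_sum = privatize_count_and_sum(data, 0.5, 0.5, rng)
    # Naive plug-in estimate of the mean; it ignores the noise in both terms,
    # which is the kind of issue that principled inference aims to address.
    print(noisy_n, noisy_sum, noisy_sum / max(noisy_n, 1.0))
```

In the bounded DP setting the true sample size would be published as is and only the sum would be noised. In this sketch the analyst sees only the noisy count, so any downstream inference has to treat the sample size as unknown.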