
Gradual Release of Sensitive Data under Differential Privacy

Abstract

We introduce the problem of releasing sensitive data under differential privacy when the privacy level is subject to change over time. Existing work assumes that the privacy level is fixed by the system designer before the sensitive data is released. For certain applications, however, users may wish to relax the privacy level for subsequent releases of the same data, following either a re-evaluation of their privacy concerns or a need for better accuracy. Specifically, given a database containing sensitive data, we assume that a response $y_1$ that preserves $\epsilon_1$-differential privacy has already been published. The privacy level is then relaxed to $\epsilon_2$, with $\epsilon_2 > \epsilon_1$, and we wish to publish a more accurate response $y_2$ such that the joint response $(y_1, y_2)$ preserves $\epsilon_2$-differential privacy. How much accuracy is lost in the scenario of gradually releasing two responses $y_1$ and $y_2$ compared to releasing a single response that is $\epsilon_2$-differentially private? Our results show that there exists a composite mechanism that achieves \textit{no loss} in accuracy. We consider the case in which the private data lies within $\mathbb{R}^n$ with an adjacency relation induced by the $\ell_1$-norm, and we focus on mechanisms that approximate identity queries. We show that the same accuracy can be achieved in the case of gradual release through a mechanism whose outputs can be described by a \textit{lazy Markov stochastic process}. This stochastic process has a closed-form expression and can be efficiently sampled. Our results apply beyond identity queries; we demonstrate their use in several settings, including Google's RAPPOR project, trading of sensitive data, and controlled transmission of private data in a social network.
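
The paper's composite (lazy Markov) mechanism is not reproduced here. For orientation only, the sketch below, with hypothetical function names and assuming identity queries under $\ell_1$-adjacency with sensitivity 1, implements the naive baseline: publish $y_1$ with Laplace noise at level $\epsilon_1$, then spend the remaining budget $\epsilon_2 - \epsilon_1$ on an independent second release. By sequential composition the pair $(y_1, y_2)$ is $\epsilon_2$-differentially private, but $y_2$ is noisier than a single $\epsilon_2$-private release; this is exactly the accuracy loss that the paper shows can be avoided.

```python
# Illustrative baseline only (an assumption for exposition), not the paper's
# lazy Markov mechanism. Setting: identity query on x in R^n, adjacency
# ||x - x'||_1 <= 1, so Laplace noise at scale 1/eps yields eps-DP.
import numpy as np

def laplace_identity(x: np.ndarray, eps: float, rng: np.random.Generator) -> np.ndarray:
    """eps-differentially private approximation of the identity query."""
    return x + rng.laplace(scale=1.0 / eps, size=x.shape)

def gradual_release_baseline(x, eps1, eps2, rng):
    """Publish y1 at level eps1 first; later spend the remaining eps2 - eps1
    on an independent release y2. By sequential composition, (y1, y2) is
    eps2-DP, but y2 alone has noise scale 1/(eps2 - eps1) > 1/eps2."""
    assert eps2 > eps1 > 0
    y1 = laplace_identity(x, eps1, rng)
    y2 = laplace_identity(x, eps2 - eps1, rng)  # independent, fresh noise
    return y1, y2

rng = np.random.default_rng(0)
x = np.zeros(5)  # hypothetical sensitive record
y1, y2 = gradual_release_baseline(x, eps1=0.5, eps2=1.0, rng=rng)
```

In contrast, the composite mechanism of the paper couples the two releases so that $y_2$ has the same accuracy as a single $\epsilon_2$-differentially private response while the pair $(y_1, y_2)$ still satisfies $\epsilon_2$-differential privacy.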
