30
108
v1v2 (latest)

LoPub: High-Dimensional Crowdsourced Data Publication with Local Differential Privacy

Abstract

High-dimensional crowdsourced data collected from a large number of users produces rich knowledge for our society. However, it also brings unprecedented privacy threats to participants. Local privacy, a variant of differential privacy, is proposed as a means to eliminate the privacy concern. Unfortunately, achieving local privacy on high-dimensional crowdsourced data raises great challenges on both efficiency and effectiveness. Here, based on EM and Lasso regression, we propose efficient multi-dimensional joint distribution estimation algorithms with local privacy. Then, we develop a Locally privacy-preserving high-dimensional data Publication algorithm, LoPub, by taking advantage of our distribution estimation techniques. In particular, both correlations and joint distribution among multiple attributes can be identified to reduce the dimension of crowdsourced data, thus achieving both efficiency and effectiveness in locally private high-dimensional data publication. Extensive experiments on real-world datasets demonstrated that the efficiency of our multivariate distribution estimation scheme and confirm the effectiveness of our LoPub scheme in generating approximate datasets with local privacy.

View on arXiv
Comments on this paper

We use cookies and other tracking technologies to improve your browsing experience on our website, to show you personalized content and targeted ads, to analyze our website traffic, and to understand where our visitors are coming from. See our policy.