Differentially Private Mixture of Generative Neural Networks

13 September 2017

G. Ács

Luca Melis

C. Castelluccia

Emiliano De Cristofaro

SyDa

ArXiv (abs)PDF HTML

Abstract

Generative models are used in an increasing number of applications that rely on large amounts of contextually rich information about individuals. Owing to possible privacy violations of individuals whose data is used to train these models, however, publishing or sharing generative models is not always viable. In this paper, we introduce a novel solution for privately releasing generative models and entire high-dimensional datasets produced by these models. We model the generator distribution of the training data by a mixture of $k$ generative neural networks. These are trained together and collectively learn the generator distribution of a dataset. Data is divided into $k$ clusters, using a novel differentially private kernel $k$ -means, then each cluster is given to separate generative neural networks, such as Restricted Boltzmann Machines or Variational Autoencoders, which are trained only on their own cluster using differentially private gradient descent. We evaluate our approach using the MNIST dataset and a large Call Detail Records dataset, showing that it produces realistic synthetic samples, which can also be used to accurately compute arbitrary number of counting queries.

View on arXiv

Comments on this paper