The Inadequacy of Similarity-based Privacy Metrics: Privacy Attacks against "Truly Anonymous" Synthetic Datasets

IEEE Symposium on Security and Privacy (S&P), 2023

8 December 2023

Georgi Ganev

Emiliano De Cristofaro

Main: 13 pages; Bibliography: 3 pages; Appendix: 3 pages; 13 figures; 8 tables

Abstract

Generative models producing synthetic data are meant to provide a privacy-friendly approach to releasing data. However, their privacy guarantees are only considered robust when models satisfy Differential Privacy (DP). Alas, this is not a ubiquitous standard, as many leading companies (and, in fact, research papers) use ad-hoc privacy metrics based on testing the statistical similarity between synthetic and real data.
