How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models

17 February 2021

Papers citing "How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models"

50 / 108 papers shown

Title
Reimagining Synthetic Tabular Data Generation through Data-Centric AI: A Comprehensive Benchmark Lasse Hansen Nabeel Seedat M. Schaar Andrija Petrović 44 19 0 25 Oct 2023
Can You Rely on Your Model Evaluation? Improving Model Evaluation with Synthetic Test Data B. V. Breugel Nabeel Seedat F. Imrie M. Schaar SyDa 26 19 0 25 Oct 2023
Mixed-Type Tabular Data Synthesis with Score-based Diffusion in Latent Space Hengrui Zhang Jiani Zhang Balasubramaniam Srinivasan Zhengyuan Shen Xiao Qin Christos Faloutsos Huzefa Rangwala George Karypis DiffM 29 82 0 14 Oct 2023
ResBit: Residual Bit Vector for Categorical Values Masane Fuchi Amar Zanashir Si-Qing Chen Tomohiro Takagi 16 1 0 29 Sep 2023
A Unified View of Differentially Private Deep Generative Modeling Dingfan Chen Raouf Kerkouche Mario Fritz SyDa 21 4 0 27 Sep 2023
Security and Privacy on Generative Data in AIGC: A Survey Tao Wang Yushu Zhang Shuren Qi Ruoyu Zhao Zhihua Xia Jian Weng 56 44 0 18 Sep 2023
TSGBench: Time Series Generation Benchmark Yihao Ang Qiang Huang Yifan Bao Anthony K. H. Tung Zhiyong Huang AI4TS 31 16 0 07 Sep 2023
Probabilistic Precision and Recall Towards Reliable Evaluation of Generative Models Dogyun Park Suhyun Kim EGVM 14 3 0 04 Sep 2023
Generating tabular datasets under differential privacy G. Truda DiffM 22 5 0 28 Aug 2023
Steering Language Generation: Harnessing Contrastive Expert Guidance and Negative Prompting for Coherent and Diverse Synthetic Data Generation Charles O'Neill Y. Ting 丁 I. Ciucă Jack Miller Thang Bui SyDa 37 1 0 15 Aug 2023
Application-Oriented Benchmarking of Quantum Generative Learning Using QUARK Florian J. Kiwit M. Marso Philipp Ross Carlos A. Riofrío Johannes Klepsch André Luckow 28 10 0 08 Aug 2023
Deep Generative Models, Synthetic Tabular Data, and Differential Privacy: An Overview and Synthesis Conor Hassan Roberto Salomone Kerrie Mengersen 23 6 0 28 Jul 2023
TransFusion: Generating Long, High Fidelity Time Series using Diffusion Models with Transformers Md Fahim Sikder R. Ramachandranpillai Fredrik Heintz DiffM 28 9 0 24 Jul 2023
MargCTGAN: A "Marginally" Better CTGAN for the Low Sample Regime Tejumade Afonja Dingfan Chen Mario Fritz 9 7 0 16 Jul 2023
On the Challenges of Deploying Privacy-Preserving Synthetic Data in the Enterprise L. Arthur Jason W Costello Jonathan Hardy Will O'Brien J. Rea Gareth Rees Georgi Ganev 24 2 0 09 Jul 2023
On the Constrained Time-Series Generation Problem Andrea Coletta Sriram Gopalakrishnan Daniel Borrajo Svitlana Vyetrenko DiffM AI4TS 31 34 0 04 Jul 2023
DiffInfinite: Large Mask-Image Synthesis via Parallel Random Patch Diffusion in Histopathology Marco Aversa Gabriel Nobis Miriam Hagele Kai Standvoss Mihaela Chirica ... D. Ivanova Wojciech Samek Frederick Klauschen B. Sanguinetti Luis Oala MedIm 27 19 0 23 Jun 2023
Aligning Synthetic Medical Images with Clinical Knowledge using Human Feedback Shenghuan Sun Gregory M. Goldgof A. Butte Ahmed Alaa MedIm 19 12 0 16 Jun 2023
Emergent Asymmetry of Precision and Recall for Measuring Fidelity and Diversity of Generative Models in High Dimensions Mahyar Khayatkhoei Wael AbdAlmageed 25 3 0 16 Jun 2023
Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models G. Stein Jesse C. Cresswell Rasa Hosseinzadeh Yi Sui Brendan Leigh Ross Valentin Villecroze Zhaoyan Liu Anthony L. Caterini J. E. T. Taylor G. Loaiza-Ganem EGVM 33 95 0 07 Jun 2023
GAT-GAN : A Graph-Attention-based Time-Series Generative Adversarial Network Srikrishna Iyer Teck-Hou Teng AI4TS 16 1 0 03 Jun 2023
Privacy Distillation: Reducing Re-identification Risk of Multimodal Diffusion Models Virginia Fernandez Pedro Sanchez W. H. Pinaya Grzegorz Jacenków Sotirios A. Tsaftaris Jorge Cardoso 29 17 0 02 Jun 2023
Challenges and Remedies to Privacy and Security in AIGC: Exploring the Potential of Privacy Computing, Blockchain, and Beyond Chuan Chen Zhenpeng Wu Yan-Hao Lai Wen-chao Ou Tianchi Liao Zibin Zheng 25 32 0 01 Jun 2023
Generating Faithful Synthetic Data with Large Language Models: A Case Study in Computational Social Science V. Veselovsky Manoel Horta Ribeiro Akhil Arora Martin Josifoski Ashton Anderson Robert West SyDa HILM 35 31 0 24 May 2023
Synthetic data, real errors: how (not) to publish and use synthetic data B. V. Breugel Zhaozhi Qian M. Schaar SyDa 62 28 0 16 May 2023
Auditing and Generating Synthetic Data with Controllable Trust Trade-offs Brian M. Belgodere Pierre L. Dognin Adam Ivankay Igor Melnyk Youssef Mroueh ... Mattia Rigotti Jerret Ross Yair Schiff Radhika Vedpathak Richard A. Young 26 12 0 21 Apr 2023
Beyond Privacy: Navigating the Opportunities and Challenges of Synthetic Data B. V. Breugel M. Schaar 19 26 0 07 Apr 2023
Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation Mayu Otani Riku Togashi Yu Sawai Ryosuke Ishigami Yuta Nakashima Esa Rahtu J. Heikkilä Shin'ichi Satoh 38 62 0 04 Apr 2023
A Guide for Practical Use of ADMG Causal Data Augmentation Audrey Poinsot Alessandro Leite CML 20 2 0 03 Apr 2023
A Framework for Demonstrating Practical Quantum Advantage: Racing Quantum against Classical Generative Models Mohamed Hibat-Allah M. Mauri Juan Carrasquilla A. Perdomo-Ortiz 24 10 0 27 Mar 2023
Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences between Pretrained Generative Models Matthew Lyle Olson Shusen Liu Rushil Anirudh Jayaraman J. Thiagarajan P. Bremer Weng-Keen Wong 28 5 0 19 Mar 2023
Synthetic Data Generator for Adaptive Interventions in Global Health Aditya Rastogi J. F. Garamendi Ana Fernández del Río Anna Guitart Moiz Hassan Khan Dexian Tang África Periánez 28 0 0 03 Mar 2023
SurvivalGAN: Generating Time-to-Event Data for Survival Analysis Alexander Norcliffe B. Cebere F. Imrie Pietro Lió M. Schaar SyDa 41 14 0 24 Feb 2023
Membership Inference Attacks against Synthetic Data through Overfitting Detection B. V. Breugel Hao Sun Zhaozhi Qian M. Schaar 33 45 0 24 Feb 2023
Feature Likelihood Divergence: Evaluating the Generalization of Generative Models Using Samples Marco Jiralerspong A. Bose I. Gemp Chongli Qin Yoram Bachrach Gauthier Gidel EGVM 32 5 0 09 Feb 2023
Diffusion-based Conditional ECG Generation with Structured State Space Models Juan Miguel Lopez Alcaraz Nils Strodthoff DiffM 20 48 0 19 Jan 2023
Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models Gowthami Somepalli Vasu Singla Micah Goldblum Jonas Geiping Tom Goldstein 26 303 0 07 Dec 2022
Generating Synthetic Data in a Secure Federated General Adversarial Networks for a Consortium of Health Registries N. Veeraragavan J. Nygaard 16 2 0 03 Dec 2022
On the Utility Recovery Incapability of Neural Net-based Differential Private Tabular Training Data Synthesizer under Privacy Deregulation Yucong Liu ChiHua Wang Guang Cheng 29 7 0 28 Nov 2022
Multi-level Data Representation For Training Deep Helmholtz Machines J. M. Ramos Luis Sa-Couto Andreas Wichert 18 0 0 26 Oct 2022
Mitigating Health Data Poverty: Generative Approaches versus Resampling for Time-series Clinical Data Raffaele Marchesi Nicolo Micheletti Giuseppe Jurman V. Osmani AI4TS 37 5 0 25 Oct 2022
Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data Nabeel Seedat Jonathan Crabbé Ioana Bica M. Schaar 19 24 0 24 Oct 2022
Evaluation of the Synthetic Electronic Health Records E. Muller Xu Zheng Jer Hayes ELM 14 1 0 16 Oct 2022
Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning Alex J. Chan M. Schaar OOD 34 1 0 11 Oct 2022
Trustworthiness of Laser-Induced Breakdown Spectroscopy Predictions via Simulation-based Synthetic Data Augmentation and Multitask Learning Riccardo Finotello D. L’hermite Celine Quéré Benjamin Rouge M. Tamaazousti J. Sirven 22 1 0 07 Oct 2022
StyleTime: Style Transfer for Synthetic Time Series Generation Yousef El-Laham Svitlana Vyetrenko AI4TS 26 5 0 22 Sep 2022
CAT: Controllable Attribute Translation for Fair Facial Attribute Classification Jiazhi Li Wael AbdAlmageed CVBM 24 8 0 14 Sep 2022
GAN-based generative modelling for dermatological applications -- comparative study Sandra Carrasco Limeros Sylwia Majchrowska Mohamad Khir Zoubi Anna Rosén Juulia Suvilehto Lisa Sjöblom Magnus Kjellberg MedIm 6 2 0 24 Aug 2022
Convergence of denoising diffusion models under the manifold hypothesis Valentin De Bortoli DiffM 24 158 0 10 Aug 2022
Synthetic Data -- what, why and how? James Jordon Lukasz Szpruch F. Houssiau M. Bottarelli Giovanni Cherubin Carsten Maple Samuel N. Cohen Adrian Weller 40 109 0 06 May 2022