Balanced Mixed-Type Tabular Data Synthesis with Diffusion Models

12 April 2024
Zeyu Yang
Peikun Guo
Khadija Zanna
Xiaoxue Yang
Akane Sano
Abstract

Diffusion models have emerged as a robust framework for various generative tasks, including tabular data synthesis. However, current tabular diffusion models tend to inherit bias in the training dataset and generate biased synthetic data, which may influence discriminatory actions. In this research, we introduce a novel tabular diffusion model that incorporates sensitive guidance to generate fair synthetic data with balanced joint distributions of the target label and sensitive attributes, such as sex and race. The empirical results demonstrate that our method effectively mitigates bias in training data while maintaining the quality of the generated samples. Furthermore, we provide evidence that our approach outperforms existing methods for synthesizing tabular data on fairness metrics such as demographic parity ratio and equalized odds ratio, achieving improvements of over 10%. Our implementation is available at https://github.com/comp-well-org/fair-tab-diffusion.
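For readers unfamiliar with the two fairness metrics named in the abstract, the following is a minimal sketch of how they are commonly computed for binary predictions. It assumes the usual min/max group-ratio convention (a value of 1.0 means parity across groups); the function names and toy data are hypothetical and are not taken from the authors' repository, whose evaluation code may differ.

```python
from collections import defaultdict


def _rate_by_group(values, groups):
    """Mean of the binary `values` within each sensitive group."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return {g: totals[g] / counts[g] for g in counts}


def _min_max_ratio(rates):
    """Smallest group rate divided by the largest group rate."""
    hi = max(rates.values())
    lo = min(rates.values())
    # Convention chosen for this sketch: all-zero rates count as parity.
    return 1.0 if hi == 0 else lo / hi


def demographic_parity_ratio(y_pred, sensitive):
    """Ratio of the lowest to the highest positive-prediction rate."""
    return _min_max_ratio(_rate_by_group(y_pred, sensitive))


def equalized_odds_ratio(y_true, y_pred, sensitive):
    """Smaller of the TPR ratio and the FPR ratio across groups."""
    ratios = []
    for label in (1, 0):  # label 1 gives TPR, label 0 gives FPR
        preds = [p for t, p in zip(y_true, y_pred) if t == label]
        grps = [g for t, g in zip(y_true, sensitive) if t == label]
        ratios.append(_min_max_ratio(_rate_by_group(preds, grps)))
    return min(ratios)


# Toy data (made up for illustration): two groups, "a" and "b".
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 1, 0]
sensitive = ["a", "a", "a", "a", "b", "b", "b", "b"]

dpr = demographic_parity_ratio(y_pred, sensitive)
eor = equalized_odds_ratio(y_true, y_pred, sensitive)
print(f"demographic parity ratio: {dpr:.3f}")
print(f"equalized odds ratio:     {eor:.3f}")
```

In the toy data, group "a" receives positive predictions at a rate of 0.75 and group "b" at 0.5, so the demographic parity ratio is about 0.667; the true-positive rates are 1.0 and 0.5 while the false-positive rates match, so the equalized odds ratio is 0.5.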

@article{yang2025_2404.08254,
  title={ Balanced Mixed-Type Tabular Data Synthesis with Diffusion Models },
  author={ Zeyu Yang and Han Yu and Peikun Guo and Khadija Zanna and Xiaoxue Yang and Akane Sano },
  journal={arXiv preprint arXiv:2404.08254},
  year={ 2025 }
}