2302.04062
Cited By
Machine Learning for Synthetic Data Generation: A Review
8 February 2023
Yingzhou Lu
Minjie Shen
Huazheng Wang
Xiao Wang
Capucine Van Rechem
Tianfan Fu
Wenqi Wei
SyDa
Papers citing
"Machine Learning for Synthetic Data Generation: A Review"
50 / 76 papers shown
MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning
Ke Wang
Junting Pan
Linda Wei
Aojun Zhou
Weikang Shi
...
Han Xiao
Yiran Yang
Houxing Ren
Mingjie Zhan
Hongsheng Li
29
0
0
15 May 2025
Quantitative Auditing of AI Fairness with Differentially Private Synthetic Data
Chih-Cheng Rex Yuan
Bow-Yaw Wang
52
0
0
30 Apr 2025
Securing the Skies: A Comprehensive Survey on Anti-UAV Methods, Benchmarking, and Future Directions
Yifei Dong
Fengyi Wu
Sanjian Zhang
Guangyu Chen
Yuzhi Hu
...
Jingdong Sun
Siyu Huang
Feng Liu
Qi Dai
Zhi-Qi Cheng
44
0
0
16 Apr 2025
Data Augmentation Through Random Style Replacement
Qikai Yang
Cheng Ji
Huaiying Luo
Panfeng Li
Zhicheng Ding
36
1
0
14 Apr 2025
ML For Hardware Design Interpretability: Challenges and Opportunities
Raymond Baartmans
Andrew Ensinger
Victor Agostinelli
Lizhong Chen
29
0
0
11 Apr 2025
Explainable AI for building energy retrofitting under data scarcity
Panagiota Rempi
Sotiris Pelekis
Alexandros-Menelaos Tzortzis
Evangelos Karakolis
Christos Ntanos
D. Askounis
33
1
0
08 Apr 2025
Engineering Artificial Intelligence: Framework, Challenges, and Future Direction
Jay Lee
Hanqi Su
Dai-Yan Ji
Takanobu Minami
AI4CE
48
0
0
03 Apr 2025
Artificial Conversations, Real Results: Fostering Language Detection with Synthetic Data
Fatemeh Mohammadi
Tommaso Romano
S. Maghool
Paolo Ceravolo
SyDa
58
0
0
31 Mar 2025
Comparing Methods for Bias Mitigation in Graph Neural Networks
Barbara Hoffmann
R. Mayer
35
0
0
28 Mar 2025
Scale Efficient Training for Large Datasets
Qing Zhou
Junyu Gao
Qi Wang
DD
78
0
0
17 Mar 2025
Synthetic Data Generation of Body Motion Data by Neural Gas Network for Emotion Recognition
Seyed Muhammad Hossein Mousavi
47
0
0
11 Mar 2025
Steered Generation via Gradient Descent on Sparse Features
Sumanta Bhattacharyya
Pedram Rooshenas
LLMSV
43
0
0
25 Feb 2025
Synthetic Data Generation by Supervised Neural Gas Network for Physiological Emotion Recognition Data
S. Muhammad Hossein Mousavi
33
1
0
19 Jan 2025
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
Ruilin Luo
Zhuofan Zheng
Yifan Wang
Yiyao Yu
Xinzhe Ni
Zicheng Lin
Jin Zeng
Yujiu Yang
LRM
70
13
0
08 Jan 2025
Can Synthetic Data be Fair and Private? A Comparative Study of Synthetic Data Generation and Fairness Algorithms
Qinyi Liu
Oscar Blessed Deho
Farhad Vadiee
Mohammad Khalil
Srecko Joksimovic
George Siemens
SyDa
43
6
0
03 Jan 2025
Unleashing the Power of Data Tsunami: A Comprehensive Survey on Data Assessment and Selection for Instruction Tuning of Language Models
Yulei Qin
Yuncheng Yang
Pengcheng Guo
Gang Li
Hang Shao
Yuchen Shi
Zihan Xu
Yun Gu
Ke Li
Xing Sun
ALM
93
12
0
31 Dec 2024
Autonomous Crack Detection using Deep Learning on Synthetic Thermogram Datasets
Chinmay Makarand Pimpalkhare
D. N. Pawaskar
86
0
0
21 Dec 2024
Simulating Tabular Datasets through LLMs to Rapidly Explore Hypotheses about Real-World Entities
Miguel Zabaleta
Joel Lehman
72
0
0
27 Nov 2024
High-precision medical speech recognition through synthetic data and semantic correction: UNITED-MEDASR
Sourav Banerjee
Ayushi Agarwal
Promila Ghosh
81
3
0
24 Nov 2024
Time-Causal VAE: Robust Financial Time Series Generator
Beatrice Acciaio
Stephan Eckstein
Songyan Hou
AI4TS
30
2
0
05 Nov 2024
Exploring the Landscape for Generative Sequence Models for Specialized Data Synthesis
Mohammad Zbeeb
Mohammad Ghorayeb
Mariam Salman
37
0
0
04 Nov 2024
Medical Imaging Complexity and its Effects on GAN Performance
William Cagas
Chan Ko
Blake Hsiao
Shryuk Grandhi
Rishi Bhattacharya
Kevin Zhu
Michael Lam
MedIm
31
1
0
23 Oct 2024
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
Xiaochuan Li
Zichun Yu
Chenyan Xiong
SyDa
33
1
0
18 Oct 2024
Extracting and Transferring Abilities For Building Multi-lingual Ability-enhanced Large Language Models
Zhipeng Chen
Liang Song
K. Zhou
Wayne Xin Zhao
Binghui Wang
Weipeng Chen
Ji-Rong Wen
65
0
0
10 Oct 2024
Targeted synthetic data generation for tabular data via hardness characterization
Tommaso Ferracci
Leonie Goldmann
Anton Hinel
Francesco Sanna Passino
135
0
0
01 Oct 2024
1 Trillion Token (1TT) Platform: A Novel Framework for Efficient Data Sharing and Compensation in Large Language Models
Chanjun Park
Hyunsoo Ha
Jihoo Kim
Yungi Kim
Dahyun Kim
Sukyung Lee
Seonghoon Yang
34
0
0
30 Sep 2024
Towards Efficient and Robust VQA-NLE Data Generation with Large Vision-Language Models
Patrick Amadeus Irawan
Genta Indra Winata
Samuel Cahyawijaya
Ayu Purwarianti
34
0
0
23 Sep 2024
TrialSynth: Generation of Synthetic Sequential Clinical Trial Data
Chufan Gao
Mandis Beigi
Afrah Shafquat
Jacob Aptekar
Jimeng Sun
28
0
0
11 Sep 2024
Accelerated Markov Chain Monte Carlo Using Adaptive Weighting Scheme
Y Samuel Wang
Wenyu Chen
Shimin Shan
34
0
0
23 Aug 2024
Enhancing Eye-Tracking Performance through Multi-Task Learning Transformer
Weigeng Li
Neng Zhou
Xiaodong Qu
41
15
0
11 Aug 2024
Enhancing Representation Learning of EEG Data with Masked Autoencoders
Yifei Zhou
Sitong Liu
41
0
0
09 Aug 2024
Advancing EEG-Based Gaze Prediction Using Depthwise Separable Convolution and Enhanced Pre-Processing
Matthew L. Key
Tural Mehtiyev
Xiaodong Qu
MDE
34
13
0
06 Aug 2024
Effect of Kernel Size on CNN-Vision-Transformer-Based Gaze Prediction Using Electroencephalography Data
Chuhui Qiu
Bugao Liang
Matthew L. Key
36
0
0
06 Aug 2024
Integrating HCI Datasets in Project-Based Machine Learning Courses: A College-Level Review and Case Study
Xiaodong Qu
Matthew L. Key
Eric Luo
Chuhui Qiu
37
0
0
06 Aug 2024
EEGMobile: Enhancing Speed and Accuracy in EEG-Based Gaze Prediction with Advanced Mobile Architectures
Teng Liang
Andrews Damoah
45
0
0
06 Aug 2024
Light-weight Fine-tuning Method for Defending Adversarial Noise in Pre-trained Medical Vision-Language Models
Xu Han
Linghao Jin
Xuezhe Ma
Xiaofeng Liu
AAML
38
3
0
02 Jul 2024
A Survey on Data Quality Dimensions and Tools for Machine Learning
Yuhan Zhou
Fengjiao Tu
Kewei Sha
Junhua Ding
Haihua Chen
46
4
0
28 Jun 2024
WarCov -- Large multilabel and multimodal dataset from social platform
Weronika Borek-Marciniec
P. Zyblewski
Jakub Klikowski
Pawel Ksieniewicz
34
0
0
10 Jun 2024
Differentially Private Fine-Tuning of Diffusion Models
Yu-Lin Tsai
Yizhe Li
Zekai Chen
Po-yu Chen
Chia-Mu Yu
Xuebin Ren
Francois Buet-Golfouse
52
3
0
03 Jun 2024
Improving Text Generation on Images with Synthetic Captions
Jun Young Koh
Sang Hyun Park
Joy Song
DiffM
51
2
0
01 Jun 2024
A Correlation- and Mean-Aware Loss Function and Benchmarking Framework to Improve GAN-based Tabular Data Synthesis
Minh H. Vu
Daniel Edler
C. Wibom
Tommy Löfstedt
Beatrice Melin
M. Rosvall
35
2
0
27 May 2024
The Future of Large Language Model Pre-training is Federated
Lorenzo Sani
Alexandru Iacob
Zeyu Cao
Bill Marino
Yan Gao
...
Wanru Zhao
William F. Shen
Preslav Aleksandrov
Xinchi Qiu
Nicholas D. Lane
AI4CE
35
13
0
17 May 2024
Skip the Benchmark: Generating System-Level High-Level Synthesis Data using Generative Machine Learning
Yuchao Liao
Tosiron Adegbija
Roman L. Lysecky
Ravi Tandon
SyDa
26
0
0
23 Apr 2024
PATE-TripleGAN: Privacy-Preserving Image Synthesis with Gaussian Differential Privacy
Zepeng Jiang
Weiwei Ni
Yifan Zhang
PICV
21
1
0
19 Apr 2024
A Diffusion-based Data Generator for Training Object Recognition Models in Ultra-Range Distance
Eran Bamani
Eden Nissinman
L. Koenigsberg
Inbar Meir
A. Sintov
41
0
0
15 Apr 2024
Best Practices and Lessons Learned on Synthetic Data for Language Models
Ruibo Liu
Jerry W. Wei
Fangyu Liu
Chenglei Si
Yanzhe Zhang
...
Steven Zheng
Daiyi Peng
Diyi Yang
Denny Zhou
Andrew M. Dai
SyDa
EgoV
41
86
0
11 Apr 2024
Fusing Pretrained ViTs with TCNet for Enhanced EEG Regression
Eric Modesitt
Haicheng Yin
Williams Huang Wang
Brian Lu
37
2
0
02 Apr 2024
TWIN-GPT: Digital Twins for Clinical Trials via Large Language Model
Yue Wang
Tianfan Fu
Yinlong Xu
Zihan Ma
Hongxia Xu
Yingzhou Lu
Bang Du
Hong-Yan Gao
Jian Wu
LM&MA
50
27
0
01 Apr 2024
Towards In-Vehicle Multi-Task Facial Attribute Recognition: Investigating Synthetic Data and Vision Foundation Models
Esmaeil Seraj
Walter Talamonti
30
0
0
10 Mar 2024
NeSy is alive and well: A LLM-driven symbolic approach for better code comment data generation and classification
Hanna Abi Akl
44
0
0
25 Feb 2024