On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey

14 June 2024
Lin Long
Rui Wang
Ruixuan Xiao
Junbo Zhao
Xiao Ding
Gang Chen
Haobo Wang
    SyDa
arXiv: 2406.15126

Papers citing "On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey"

32 / 32 papers shown
Synthline: A Product Line Approach for Synthetic Requirements Engineering Data Generation using Large Language Models
Abdelkarim El-Hajjami
Camille Salinesi
SyDa
34
0
0
06 May 2025
A Typology of Synthetic Datasets for Dialogue Processing in Clinical Contexts
Steven Bedrick
A. Seza Doğruöz
Sergiu Nisioi
131
0
0
05 May 2025
AKD : Adversarial Knowledge Distillation For Large Language Models Alignment on Coding tasks
Ilyas Oulkadda
Julien Perez
ALM
42
0
0
05 May 2025
An LLM-Empowered Low-Resolution Vision System for On-Device Human Behavior Understanding
Siyang Jiang
Bufang Yang
Lilin Xu
Mu Yuan
Yeerzhati Abudunuer
...
Liekang Zeng
Hongkai Chen
Zhenyu Yan
Xiaofan Jiang
Guoliang Xing
VLM
86
0
0
03 May 2025
TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models
Mihai Nadas
Laura Diosan
Andrei Piscoran
Andreea Tomescu
VGen
57
0
0
29 Apr 2025
LLM-based Semantic Augmentation for Harmful Content Detection
Elyas Meguellati
Assaad Zeghina
S. Sadiq
Gianluca Demartini
34
0
0
22 Apr 2025
Leveraging LLMs for User Stories in AI Systems: UStAI Dataset
Asma Z. Yamani
Malak Baslyman
Moataz Ahmed
28
0
0
01 Apr 2025
Synthetic News Generation for Fake News Classification
Abdul Sittar
Luka Golob
Mateja Smiljanic
35
0
0
31 Mar 2025
Who Relies More on World Knowledge and Bias for Syntactic Ambiguity Resolution: Humans or LLMs?
So Young Lee
Russell Scheinberg
Amber Shore
Ameeta Agrawal
46
1
0
13 Mar 2025
Faster, Cheaper, Better: Multi-Objective Hyperparameter Optimization for LLM and RAG Systems
Matthew Barker
Andrew Bell
Evan Thomas
James Carr
Thomas Andrews
Umang Bhatt
80
1
0
25 Feb 2025
Beyond Translation: LLM-Based Data Generation for Multilingual Fact-Checking
Yi-Ling Chung
Aurora Cobo
Pablo Serna
SyDa
HILM
58
0
0
24 Feb 2025
Man Made Language Models? Evaluating LLMs' Perpetuation of Masculine Generics Bias
Enzo Doyen
Amalia Todirascu
40
0
0
14 Feb 2025
Measuring Diversity in Synthetic Datasets
Yuchang Zhu
Huizhe Zhang
Bingzhe Wu
Jintang Li
Zibin Zheng
Peilin Zhao
Liang Chen
Yatao Bian
100
0
0
12 Feb 2025
Few-shot LLM Synthetic Data with Distribution Matching
Jiyuan Ren
Zhaocheng Du
Zhihao Wen
Qinglin Jia
Sunhao Dai
Chuhan Wu
Zhenhua Dong
SyDa
77
0
0
09 Feb 2025
MAG-V: A Multi-Agent Framework for Synthetic Data Generation and Verification
Saptarshi Sengupta
Kristal Curtis
Akshay Mallipeddi
Abhinav Mathur
Joseph Ross
Liang Gou
LLMAG
SyDa
125
1
0
28 Nov 2024
A Flexible Large Language Models Guardrail Development Methodology Applied to Off-Topic Prompt Detection
Gabriel Chua
Shing Yee Chan
Shaun Khoo
75
1
0
20 Nov 2024
Mastering the Craft of Data Synthesis for CodeLLMs
Meng Chen
Philip Arthur
Qianyu Feng
Cong Duy Vu Hoang
Yu-Heng Hong
...
Mark Johnson
K. K.
Don Dharmasiri
Long Duong
Yuan-Fang Li
SyDa
58
1
0
16 Oct 2024
DEPT: Decoupled Embeddings for Pre-training Language Models
Alex Iacob
Lorenzo Sani
Meghdad Kurmanji
William F. Shen
Xinchi Qiu
Dongqi Cai
Yan Gao
Nicholas D. Lane
VLM
139
0
0
07 Oct 2024
Exploring LLM-based Data Annotation Strategies for Medical Dialogue Preference Alignment
Chengfeng Dou
Y. Zhang
Zhi Jin
Wenpin Jiao
Haiyan Zhao
Yongqiang Zhao
Zhengwei Tao
30
0
0
05 Oct 2024
Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data
Sreyan Ghosh
Sonal Kumar
Zhifeng Kong
Rafael Valle
Bryan Catanzaro
Dinesh Manocha
DiffM
47
2
0
02 Oct 2024
Efficacy of Synthetic Data as a Benchmark
Gaurav Maheshwari
Dmitry Ivanov
Kevin El Haddad
SyDa
18
6
0
18 Sep 2024
What is the Role of Small Models in the LLM Era: A Survey
Lihu Chen
Gaël Varoquaux
ALM
63
23
0
10 Sep 2024
RAGent: Retrieval-based Access Control Policy Generation
Sakuna Jayasundara
N. Arachchilage
Giovanni Russello
51
1
0
08 Sep 2024
Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and the Case of Information Extraction
Martin Josifoski
Marija Sakota
Maxime Peyrard
Robert West
SyDa
56
78
0
07 Mar 2023
Mixture of Soft Prompts for Controllable Data Generation
Derek Chen
Celine Lee
Yunan Lu
Domenic Rosati
Zhou Yu
109
22
0
02 Mar 2023
ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback
Jiacheng Ye
Jiahui Gao
Jiangtao Feng
Zhiyong Wu
Tao Yu
Lingpeng Kong
SyDa
VLM
73
70
0
22 Oct 2022
Creating Training Sets via Weak Indirect Supervision
Jieyu Zhang
Bohan Wang
Xiangchen Song
Yujing Wang
Yaming Yang
Jing Bai
Alexander Ratner
OffRL
51
17
0
07 Oct 2021
What Makes Good In-Context Examples for GPT-3?
Jiachang Liu
Dinghan Shen
Yizhe Zhang
Bill Dolan
Lawrence Carin
Weizhu Chen
AAML
RALM
275
1,312
0
17 Jan 2021
Efficient Intent Detection with Dual Sentence Encoders
I. Casanueva
Tadas Temčinas
D. Gerz
Matthew Henderson
Ivan Vulić
VLM
180
451
0
10 Mar 2020
A Survey on Knowledge Graphs: Representation, Acquisition and Applications
Shaoxiong Ji
Shirui Pan
Erik Cambria
Pekka Marttinen
Philip S. Yu
181
1,940
0
02 Feb 2020
FewRel 2.0: Towards More Challenging Few-Shot Relation Classification
Tianyu Gao
Xu Han
Hao Zhu
Zhiyuan Liu
Peng Li
Maosong Sun
Jie Zhou
205
244
0
16 Oct 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,956
0
20 Apr 2018