ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2202.04538
  4. Cited By
Generating Training Data with Language Models: Towards Zero-Shot
  Language Understanding
v1v2 (latest)

Generating Training Data with Language Models: Towards Zero-Shot Language Understanding

9 February 2022
Yu Meng
Jiaxin Huang
Yu Zhang
Jiawei Han
    SyDa
ArXiv (abs)PDFHTML

Papers citing "Generating Training Data with Language Models: Towards Zero-Shot Language Understanding"

50 / 87 papers shown
Title
Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data
Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data
Haoxin Li
Boyang Li
CoGe
145
1
0
03 Mar 2025
BERTtime Stories: Investigating the Role of Synthetic Story Data in Language Pre-training
BERTtime Stories: Investigating the Role of Synthetic Story Data in Language Pre-training
Nikitas Theodoropoulos
Giorgos Filandrianos
Vassilis Lyberatos
Maria Lymperaiou
Giorgos Stamou
SyDa
195
1
0
24 Feb 2025
Synthetic vs. Gold: The Role of LLM-Generated Labels and Data in Cyberbullying Detection
Synthetic vs. Gold: The Role of LLM-Generated Labels and Data in Cyberbullying Detection
Arefeh Kazemi
Sri Balaaji Natarajan Kalaivendan
Joachim Wagner
Hamza Qadeer
Brian Davis
130
1
0
21 Feb 2025
Mitigating the Privacy Issues in Retrieval-Augmented Generation (RAG) via Pure Synthetic Data
Mitigating the Privacy Issues in Retrieval-Augmented Generation (RAG) via Pure Synthetic Data
Shenglai Zeng
Jiankun Zhang
Pengfei He
J. Ren
Tianqi Zheng
Hanqing Lu
Han Xu
Hui Liu
Yue Xing
Jiliang Tang
182
12
0
21 Feb 2025
Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models
Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models
Ran Xu
Hejie Cui
Yue Yu
Xuan Kan
Wenqi Shi
Yuchen Zhuang
Wei Jin
Joyce C. Ho
Carl Yang
154
16
0
28 Jan 2025
Paint Outside the Box: Synthesizing and Selecting Training Data for Visual Grounding
Paint Outside the Box: Synthesizing and Selecting Training Data for Visual Grounding
Zilin Du
Haoxin Li
Jianfei Yu
Boyang Li
479
0
0
01 Dec 2024
CorrSynth -- A Correlated Sampling Method for Diverse Dataset Generation from LLMs
CorrSynth -- A Correlated Sampling Method for Diverse Dataset Generation from LLMs
Suhas S Kowshik
Abhishek Divekar
Vijit Malik
SyDa
128
0
0
13 Nov 2024
Not All LLM-Generated Data Are Equal: Rethinking Data Weighting in Text Classification
Not All LLM-Generated Data Are Equal: Rethinking Data Weighting in Text Classification
Hsun-Yu Kuo
Yin-Hsiang Liao
Yu-Chieh Chao
Wei-Yun Ma
Pu-Jen Cheng
SyDa
99
4
0
28 Oct 2024
Self-calibration for Language Model Quantization and Pruning
Self-calibration for Language Model Quantization and Pruning
Miles Williams
G. Chrysostomou
Nikolaos Aletras
MQ
461
0
0
22 Oct 2024
ToxiCraft: A Novel Framework for Synthetic Generation of Harmful Information
ToxiCraft: A Novel Framework for Synthetic Generation of Harmful Information
Zheng Hui
Zhaoxiao Guo
Hang Zhao
Juanyong Duan
Congrui Huang
97
7
0
23 Sep 2024
What is the Role of Small Models in the LLM Era: A Survey
What is the Role of Small Models in the LLM Era: A Survey
Lihu Chen
Gaël Varoquaux
ALM
189
32
0
10 Sep 2024
Boosting Reward Model with Preference-Conditional Multi-Aspect Synthetic Data Generation
Boosting Reward Model with Preference-Conditional Multi-Aspect Synthetic Data Generation
Jiaming Shen
Ran Xu
Yennie Jun
Zhen Qin
Tianqi Liu
Carl Yang
Yi Liang
Simon Baumgartner
Michael Bendersky
SyDa
114
5
0
22 Jul 2024
Contrastive Learning from Synthetic Audio Doppelgängers
Contrastive Learning from Synthetic Audio Doppelgängers
Manuel Cherep
Nikhil Singh
83
1
0
09 Jun 2024
TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision
TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision
Yunyi Zhang
Ruozhen Yang
Xueqiang Xu
Rui Li
Jinfeng Xiao
Jiaming Shen
Jiawei Han
98
17
0
29 Feb 2024
Pretraining Text Encoders with Adversarial Mixture of Training Signal
  Generators
Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators
Yu Meng
Chenyan Xiong
Payal Bajaj
Saurabh Tiwary
Paul N. Bennett
Jiawei Han
Xia Song
MoE
75
16
0
07 Apr 2022
ZeroGen: Efficient Zero-shot Learning via Dataset Generation
ZeroGen: Efficient Zero-shot Learning via Dataset Generation
Jiacheng Ye
Jiahui Gao
Qintong Li
Hang Xu
Jiangtao Feng
Zhiyong Wu
Tao Yu
Lingpeng Kong
SyDa
101
220
0
16 Feb 2022
ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves
  Zero-Shot Generalization
ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization
Hanwei Xu
Yujun Chen
Yulun Du
Nan Shao
Yanggang Wang
Haiyu Li
Zhilin Yang
VLMLRMAI4CE
75
69
0
18 Jan 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
355
1,708
0
15 Oct 2021
A Plug-and-Play Method for Controlled Text Generation
A Plug-and-Play Method for Controlled Text Generation
Damian Pascual
Béni Egressy
Clara Meister
Ryan Cotterell
Roger Wattenhofer
122
93
0
20 Sep 2021
Towards Zero-Label Language Learning
Towards Zero-Label Language Learning
Zirui Wang
Adams Wei Yu
Orhan Firat
Yuan Cao
SyDa
242
105
0
19 Sep 2021
What Changes Can Large-scale Language Models Bring? Intensive Study on
  HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers
What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers
Boseop Kim
Hyoungseok Kim
Sang-Woo Lee
Gichang Lee
Donghyun Kwak
...
Jaewook Kang
Inho Kang
Jung-Woo Ha
W. Park
Nako Sung
VLM
288
123
0
10 Sep 2021
Finetuned Language Models Are Zero-Shot Learners
Finetuned Language Models Are Zero-Shot Learners
Jason W. Wei
Maarten Bosma
Vincent Zhao
Kelvin Guu
Adams Wei Yu
Brian Lester
Nan Du
Andrew M. Dai
Quoc V. Le
ALMUQCV
227
3,782
0
03 Sep 2021
Differentiable Prompt Makes Pre-trained Language Models Better Few-shot
  Learners
Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners
Ningyu Zhang
Luoqiu Li
Xiang Chen
Shumin Deng
Zhen Bi
Chuanqi Tan
Fei Huang
Huajun Chen
VLM
121
179
0
30 Aug 2021
Controlled Text Generation as Continuous Optimization with Multiple
  Constraints
Controlled Text Generation as Continuous Optimization with Multiple Constraints
Sachin Kumar
Eric Malmi
Aliaksei Severyn
Yulia Tsvetkov
BDLAI4CE
84
79
0
04 Aug 2021
Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with
  Language Models
Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models
Robert L Logan IV
Ivana Balavzević
Eric Wallace
Fabio Petroni
Sameer Singh
Sebastian Riedel
VPVLM
92
211
0
24 Jun 2021
BARTScore: Evaluating Generated Text as Text Generation
BARTScore: Evaluating Generated Text as Text Generation
Weizhe Yuan
Graham Neubig
Pengfei Liu
119
849
0
22 Jun 2021
True Few-Shot Learning with Language Models
True Few-Shot Learning with Language Models
Ethan Perez
Douwe Kiela
Kyunghyun Cho
137
439
0
24 May 2021
DExperts: Decoding-Time Controlled Text Generation with Experts and
  Anti-Experts
DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts
Alisa Liu
Maarten Sap
Ximing Lu
Swabha Swayamdipta
Chandra Bhagavatula
Noah A. Smith
Yejin Choi
MU
115
375
0
07 May 2021
Understanding Factuality in Abstractive Summarization with FRANK: A
  Benchmark for Factuality Metrics
Understanding Factuality in Abstractive Summarization with FRANK: A Benchmark for Factuality Metrics
Artidoro Pagnoni
Vidhisha Balachandran
Yulia Tsvetkov
HILM
276
311
0
27 Apr 2021
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in
  NLP
CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP
Qinyuan Ye
Bill Yuchen Lin
Xiang Ren
288
185
0
18 Apr 2021
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
Kang Min Yoo
Dongju Park
Jaewook Kang
Sang-Woo Lee
Woomyeong Park
94
242
0
18 Apr 2021
SimCSE: Simple Contrastive Learning of Sentence Embeddings
SimCSE: Simple Contrastive Learning of Sentence Embeddings
Tianyu Gao
Xingcheng Yao
Danqi Chen
AILawSSL
278
3,411
0
18 Apr 2021
Cross-Task Generalization via Natural Language Crowdsourcing
  Instructions
Cross-Task Generalization via Natural Language Crowdsourcing Instructions
Swaroop Mishra
Daniel Khashabi
Chitta Baral
Hannaneh Hajishirzi
LRM
164
752
0
18 Apr 2021
Generating Datasets with Pretrained Language Models
Generating Datasets with Pretrained Language Models
Timo Schick
Hinrich Schütze
152
235
0
15 Apr 2021
FUDGE: Controlled Text Generation With Future Discriminators
FUDGE: Controlled Text Generation With Future Discriminators
Kevin Kaichuang Yang
Dan Klein
107
336
0
12 Apr 2021
Adapting Language Models for Zero-shot Learning by Meta-tuning on
  Dataset and Prompt Collections
Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections
Ruiqi Zhong
Kristy Lee
Zheng Zhang
Dan Klein
97
173
0
10 Apr 2021
Improving and Simplifying Pattern Exploiting Training
Improving and Simplifying Pattern Exploiting Training
Derek Tam
Rakesh R Menon
Joey Tianyi Zhou
Shashank Srivastava
Colin Raffel
74
151
0
22 Mar 2021
GPT Understands, Too
GPT Understands, Too
Xiao Liu
Yanan Zheng
Zhengxiao Du
Ming Ding
Yujie Qian
Zhilin Yang
Jie Tang
VLM
168
1,179
0
18 Mar 2021
How Many Data Points is a Prompt Worth?
How Many Data Points is a Prompt Worth?
Teven Le Scao
Alexander M. Rush
VLM
160
302
0
15 Mar 2021
COCO-LM: Correcting and Contrasting Text Sequences for Language Model
  Pretraining
COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
Yu Meng
Chenyan Xiong
Payal Bajaj
Saurabh Tiwary
Paul N. Bennett
Jiawei Han
Xia Song
180
205
0
16 Feb 2021
Neural Data Augmentation via Example Extrapolation
Neural Data Augmentation via Example Extrapolation
Kenton Lee
Kelvin Guu
Luheng He
Timothy Dozat
Hyung Won Chung
74
72
0
02 Feb 2021
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Xiang Lisa Li
Percy Liang
250
4,299
0
01 Jan 2021
Making Pre-trained Language Models Better Few-shot Learners
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao
Adam Fisch
Danqi Chen
404
1,972
0
31 Dec 2020
Few-Shot Text Generation with Pattern-Exploiting Training
Few-Shot Text Generation with Pattern-Exploiting Training
Timo Schick
Hinrich Schütze
93
147
0
22 Dec 2020
A Distributional Approach to Controlled Text Generation
A Distributional Approach to Controlled Text Generation
Muhammad Khalifa
Hady ElSahar
Marc Dymetman
153
119
0
21 Dec 2020
PowerTransformer: Unsupervised Controllable Revision for Biased Language
  Correction
PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction
Xinyao Ma
Maarten Sap
Hannah Rashkin
Yejin Choi
82
73
0
26 Oct 2020
Universal Natural Language Processing with Limited Annotations: Try
  Few-shot Textual Entailment as a Start
Universal Natural Language Processing with Limited Annotations: Try Few-shot Textual Entailment as a Start
Wenpeng Yin
Nazneen Rajani
Dragomir R. Radev
R. Socher
Caiming Xiong
74
69
0
06 Oct 2020
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language
  Models
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models
Samuel Gehman
Suchin Gururangan
Maarten Sap
Yejin Choi
Noah A. Smith
163
1,214
0
24 Sep 2020
It's Not Just Size That Matters: Small Language Models Are Also Few-Shot
  Learners
It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
Timo Schick
Hinrich Schütze
130
974
0
15 Sep 2020
GeDi: Generative Discriminator Guided Sequence Generation
GeDi: Generative Discriminator Guided Sequence Generation
Ben Krause
Akhilesh Deepak Gotmare
Bryan McCann
N. Keskar
Shafiq Joty
R. Socher
Nazneen Rajani
134
408
0
14 Sep 2020
12
Next