2202.07922
ZeroGen: Efficient Zero-shot Learning via Dataset Generation
16 February 2022
Jiacheng Ye
Jiahui Gao
Qintong Li
Hang Xu
Jiangtao Feng
Zhiyong Wu
Tao Yu
Lingpeng Kong
SyDa
Papers citing
"ZeroGen: Efficient Zero-shot Learning via Dataset Generation"
50 / 70 papers shown
Title
Mitigating the Privacy Issues in Retrieval-Augmented Generation (RAG) via Pure Synthetic Data
Shenglai Zeng
Jiankun Zhang
Pengfei He
J. Ren
Tianqi Zheng
Hanqing Lu
Han Xu
Hui Liu
Yue Xing
Jiliang Tang
160
12
0
21 Feb 2025
Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents
Vardaan Pahuja
Yadong Lu
Corby Rosset
Boyu Gou
Arindam Mitra
Spencer Whitehead
Yu Su
Ahmed Awadallah
LLMAG
LM&Ro
Presented at ResearchTrend Connect | LLMAG on 14 Mar 2025
180
5
1
17 Feb 2025
Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models
Ran Xu
Hejie Cui
Yue Yu
Xuan Kan
Wenqi Shi
Yuchen Zhuang
Wei Jin
Joyce C. Ho
Carl Yang
106
16
0
28 Jan 2025
Paint Outside the Box: Synthesizing and Selecting Training Data for Visual Grounding
Zilin Du
Haoxin Li
Jianfei Yu
Boyang Li
384
0
0
01 Dec 2024
CorrSynth -- A Correlated Sampling Method for Diverse Dataset Generation from LLMs
Suhas S Kowshik
Abhishek Divekar
Vijit Malik
SyDa
95
0
0
13 Nov 2024
Not All LLM-Generated Data Are Equal: Rethinking Data Weighting in Text Classification
Hsun-Yu Kuo
Yin-Hsiang Liao
Yu-Chieh Chao
Wei-Yun Ma
Pu-Jen Cheng
SyDa
82
3
0
28 Oct 2024
ToxiCraft: A Novel Framework for Synthetic Generation of Harmful Information
Zheng Hui
Zhaoxiao Guo
Hang Zhao
Juanyong Duan
Congrui Huang
72
7
0
23 Sep 2024
What is the Role of Small Models in the LLM Era: A Survey
Lihu Chen
Gaël Varoquaux
ALM
142
26
0
10 Sep 2024
Boosting Reward Model with Preference-Conditional Multi-Aspect Synthetic Data Generation
Jiaming Shen
Ran Xu
Yennie Jun
Zhen Qin
Tianqi Liu
Carl Yang
Yi Liang
Simon Baumgartner
Michael Bendersky
SyDa
85
5
0
22 Jul 2024
TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision
Yunyi Zhang
Ruozhen Yang
Xueqiang Xu
Rui Li
Jinfeng Xiao
Jiaming Shen
Jiawei Han
57
14
0
29 Feb 2024
OPT: Open Pre-trained Transformer Language Models
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
...
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
284
3,583
0
02 May 2022
In-Context Learning for Few-Shot Dialogue State Tracking
Yushi Hu
Chia-Hsuan Lee
Tianbao Xie
Tao Yu
Noah A. Smith
Mari Ostendorf
BDL
67
60
0
16 Mar 2022
Generating Training Data with Language Models: Towards Zero-Shot Language Understanding
Yu Meng
Jiaxin Huang
Yu Zhang
Jiawei Han
SyDa
45
235
0
09 Feb 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
587
9,009
0
28 Jan 2022
ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization
Hanwei Xu
Yujun Chen
Yulun Du
Nan Shao
Yanggang Wang
Haiyu Li
Zhilin Yang
VLM
LRM
AI4CE
58
69
0
18 Jan 2022
WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation
Alisa Liu
Swabha Swayamdipta
Noah A. Smith
Yejin Choi
120
219
0
16 Jan 2022
Data-Free Knowledge Transfer: A Survey
Yuang Liu
Wei Zhang
Jun Wang
Jianyong Wang
73
48
0
31 Dec 2021
A Survey on Green Deep Learning
Jingjing Xu
Wangchunshu Zhou
Zhiyi Fu
Hao Zhou
Lei Li
VLM
119
83
0
08 Nov 2021
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
309
1,679
0
15 Oct 2021
Towards Zero-Label Language Learning
Zirui Wang
Adams Wei Yu
Orhan Firat
Yuan Cao
SyDa
223
103
0
19 Sep 2021
Reframing Instructional Prompts to GPTk's Language
Swaroop Mishra
Daniel Khashabi
Chitta Baral
Yejin Choi
Hannaneh Hajishirzi
59
216
0
16 Sep 2021
STraTA: Self-Training with Task Augmentation for Better Few-shot Learning
Tu Vu
Minh-Thang Luong
Quoc V. Le
Grady Simon
Mohit Iyyer
142
61
0
13 Sep 2021
Finetuned Language Models Are Zero-Shot Learners
Jason W. Wei
Maarten Bosma
Vincent Zhao
Kelvin Guu
Adams Wei Yu
Brian Lester
Nan Du
Andrew M. Dai
Quoc V. Le
ALM
UQCV
69
3,678
0
03 Sep 2021
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
Pengfei Liu
Weizhe Yuan
Jinlan Fu
Zhengbao Jiang
Hiroaki Hayashi
Graham Neubig
VLM
SyDa
165
3,934
0
28 Jul 2021
One2Set: Generating Diverse Keyphrases as a Set
Jiacheng Ye
Tao Gui
Yichao Luo
Yige Xu
Qi Zhang
40
80
0
24 May 2021
Surface Form Competition: Why the Highest Probability Answer Isn't Always Right
Ari Holtzman
Peter West
Vered Shwartz
Yejin Choi
Luke Zettlemoyer
LRM
51
234
0
16 Apr 2021
Generating Datasets with Pretrained Language Models
Timo Schick
Hinrich Schütze
124
235
0
15 Apr 2021
Adapting Language Models for Zero-shot Learning by Meta-tuning on Dataset and Prompt Collections
Ruiqi Zhong
Kristy Lee
Zheng Zhang
Dan Klein
75
171
0
10 Apr 2021
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
280
380
0
28 Feb 2021
Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm
Laria Reynolds
Kyle McDonell
82
877
0
15 Feb 2021
MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers
Krishna Pillutla
Swabha Swayamdipta
Rowan Zellers
John Thickstun
Sean Welleck
Yejin Choi
Zaïd Harchaoui
89
347
0
02 Feb 2021
Neural Data Augmentation via Example Extrapolation
Kenton Lee
Kelvin Guu
Luheng He
Timothy Dozat
Hyung Won Chung
33
72
0
02 Feb 2021
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
W. Fedus
Barret Zoph
Noam M. Shazeer
MoE
57
2,136
0
11 Jan 2021
I-BERT: Integer-only BERT Quantization
Sehoon Kim
A. Gholami
Z. Yao
Michael W. Mahoney
Kurt Keutzer
MQ
133
348
0
05 Jan 2021
Towards Zero-Shot Knowledge Distillation for Natural Language Processing
Ahmad Rashid
Vasileios Lioutas
Abbas Ghaddar
Mehdi Rezagholizadeh
68
27
0
31 Dec 2020
UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation
Jian Guan
Minlie Huang
49
70
0
16 Sep 2020
Learning from Noisy Labels with Deep Neural Networks: A Survey
Hwanjun Song
Minseok Kim
Dongmin Park
Yooju Shin
Jae-Gil Lee
NoLa
84
979
0
16 Jul 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
507
41,106
0
28 May 2020
UnifiedQA: Crossing Format Boundaries With a Single QA System
Daniel Khashabi
Sewon Min
Tushar Khot
Ashish Sabharwal
Oyvind Tafjord
Peter Clark
Hannaneh Hajishirzi
105
731
0
02 May 2020
MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices
Zhiqing Sun
Hongkun Yu
Xiaodan Song
Renjie Liu
Yiming Yang
Denny Zhou
MQ
90
807
0
06 Apr 2020
Training Question Answering Models From Synthetic Data
Raul Puri
Ryan Spring
M. Patwary
Mohammad Shoeybi
Bryan Catanzaro
ELM
70
159
0
22 Feb 2020
Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning
Mitchell A. Gordon
Kevin Duh
Nicholas Andrews
VLM
41
339
0
19 Feb 2020
Beat the AI: Investigating Adversarial Human Annotation for Reading Comprehension
Max Bartolo
A. Roberts
Johannes Welbl
Sebastian Riedel
Pontus Stenetorp
AAML
77
171
0
02 Feb 2020
How Can We Know What Language Models Know?
Zhengbao Jiang
Frank F. Xu
Jun Araki
Graham Neubig
KELM
99
1,396
0
28 Nov 2019
How Decoding Strategies Affect the Verifiability of Generated Text
Luca Massarelli
Fabio Petroni
Aleksandra Piktus
Myle Ott
Tim Rocktäschel
Vassilis Plachouras
Fabrizio Silvestri
Sebastian Riedel
60
50
0
09 Nov 2019
Not Enough Data? Deep Learning to the Rescue!
Ateret Anaby-Tavor
Boaz Carmeli
Esther Goldbraich
Amir Kantor
George Kour
Segev Shlomov
N. Tepper
Naama Zwerdling
52
369
0
08 Nov 2019
Q8BERT: Quantized 8Bit BERT
Ofir Zafrir
Guy Boudoukh
Peter Izsak
Moshe Wasserblat
MQ
57
502
0
14 Oct 2019
Structured Pruning of Large Language Models
Ziheng Wang
Jeremy Wohlwend
Tao Lei
38
283
0
10 Oct 2019
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
136
7,437
0
02 Oct 2019
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
272
6,420
0
26 Sep 2019