2005.14165
Language Models are Few-Shot Learners
28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
Papers citing
"Language Models are Few-Shot Learners"
50 / 12,362 papers shown
Title
Few-Shot Learning of Compact Models via Task-Specific Meta Distillation
Yong Wu
Shekhor Chanda
M. Hosseinzadeh
Zhi Liu
Yang Wang
VLM
105
8
0
18 Oct 2022
Controllable Fake Document Infilling for Cyber Deception
Yibo Hu
Yu Lin
Eric Parolin
Latif Khan
Kevin W. Hamlen
66
8
0
18 Oct 2022
Taxonomy of Abstractive Dialogue Summarization: Scenarios, Approaches and Future Directions
Qi Jia
Yizhu Liu
Siyu Ren
Kenny Q. Zhu
80
8
0
18 Oct 2022
Transfer learning with affine model transformation
Shunya Minami
Kenji Fukumizu
Yoshihiro Hayashi
Ryo Yoshida
69
1
0
18 Oct 2022
DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for Controllable Text Generation
Hanqing Zhang
Dawei Song
89
37
0
18 Oct 2022
Fine-mixing: Mitigating Backdoors in Fine-tuned Language Models
Zhiyuan Zhang
Lingjuan Lyu
Xingjun Ma
Chenguang Wang
Xu Sun
AAML
64
43
0
18 Oct 2022
Systematicity in GPT-3's Interpretation of Novel English Noun Compounds
Siyan Li
Riley Carlson
Christopher Potts
AI4CE
52
15
0
18 Oct 2022
Using Bottleneck Adapters to Identify Cancer in Clinical Notes under Low-Resource Constraints
Omid Rohanian
Hannah Jauncey
Mohammadmahdi Nouriborji
Vinod Kumar Chauhan
Bronner P. Gonçalves
Christiana Kartsonaki
ISARIC Clinical Characterisation Group
L. Merson
David Clifton
59
7
0
17 Oct 2022
Deepfake Text Detection: Limitations and Opportunities
Jiameng Pu
Zain Sarwar
Sifat Muhammad Abdullah
A. Rehman
Yoonjin Kim
P. Bhattacharya
M. Javed
Bimal Viswanath
AAML
68
57
0
17 Oct 2022
Deep Bidirectional Language-Knowledge Graph Pretraining
Michihiro Yasunaga
Antoine Bosselut
Hongyu Ren
Xikun Zhang
Christopher D. Manning
Percy Liang
J. Leskovec
101
204
0
17 Oct 2022
Non-Contrastive Learning Meets Language-Image Pre-Training
Jinghao Zhou
Li Dong
Zhe Gan
Lijuan Wang
Furu Wei
VLM
CLIP
75
26
0
17 Oct 2022
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them
Mirac Suzgun
Nathan Scales
Nathanael Schärli
Sebastian Gehrmann
Yi Tay
...
Aakanksha Chowdhery
Quoc V. Le
Ed H. Chi
Denny Zhou
Jason W. Wei
ALM
ELM
LRM
ReLM
283
1,143
0
17 Oct 2022
Table-To-Text generation and pre-training with TabT5
Ewa Andrejczuk
Julian Martin Eisenschlos
Francesco Piccinno
Syrine Krichene
Yasemin Altun
LMTD
66
31
0
17 Oct 2022
Prompting GPT-3 To Be Reliable
Chenglei Si
Zhe Gan
Zhengyuan Yang
Shuohang Wang
Jianfeng Wang
Jordan L. Boyd-Graber
Lijuan Wang
KELM
LRM
113
303
0
17 Oct 2022
Flipped Classroom: Effective Teaching for Time Series Forecasting
P. Teutsch
Patrick Mäder
AI4TS
67
8
0
17 Oct 2022
Approximating Continuous Convolutions for Deep Network Compression
Theo W. Costain
V. Prisacariu
66
0
0
17 Oct 2022
Meta-Learning via Classifier(-free) Diffusion Guidance
Elvis Nava
Seijin Kobayashi
Yifei Yin
Robert K. Katzschmann
Benjamin Grewe
VLM
71
6
0
17 Oct 2022
Plug-and-Play VQA: Zero-shot VQA by Conjoining Large Pretrained Models with Zero Training
A. M. H. Tiong
Junnan Li
Boyang Albert Li
Silvio Savarese
Guosheng Lin
MLLM
131
109
0
17 Oct 2022
Keep Me Updated! Memory Management in Long-term Conversations
Sanghwan Bae
Donghyun Kwak
Soyoung Kang
Min Young Lee
Sungdong Kim
Yuin Jeong
Hyeri Kim
Sang-Woo Lee
W. Park
Nako Sung
116
51
0
17 Oct 2022
RARR: Researching and Revising What Language Models Say, Using Language Models
Luyu Gao
Zhuyun Dai
Panupong Pasupat
Anthony Chen
Arun Tejasvi Chaganty
...
Vincent Zhao
Ni Lao
Hongrae Lee
Da-Cheng Juan
Kelvin Guu
HILM
KELM
133
260
0
17 Oct 2022
Continuous Pseudo-Labeling from the Start
Dan Berrebbi
R. Collobert
Samy Bengio
Navdeep Jaitly
Tatiana Likhomanenko
65
16
0
17 Oct 2022
Teacher Forcing Recovers Reward Functions for Text Generation
Yongchang Hao
Yuxin Liu
Lili Mou
OffRL
91
12
0
17 Oct 2022
Review learning: Real world validation of privacy preserving continual learning across medical institutions
Jaesung Yoo
Sung-Hyuk Choi
Yewon Yang
Suhyeon Kim
J. Choi
...
H. J. Joo
Dae-Jung Kim
R. Park
Hyeong-Jin Yoon
Kwangsoo Kim
OffRL
KELM
86
0
0
17 Oct 2022
Accelerating Transfer Learning with Near-Data Computation on Cloud Object Stores
Arsany Guirguis
Diana Petrescu
Florin Dinu
D. Quoc
Javier Picorel
R. Guerraoui
71
0
0
16 Oct 2022
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning
Tzu-hsun Feng
Annie Dong
Ching-Feng Yeh
Shu-Wen Yang
Tzu-Quan Lin
...
Xuankai Chang
Shinji Watanabe
Abdel-rahman Mohamed
Shang-Wen Li
Hung-yi Lee
ELM
SSL
102
35
0
16 Oct 2022
NormSAGE: Multi-Lingual Multi-Cultural Norm Discovery from Conversations On-the-Fly
Yi R. Fung
Tuhin Chakraborty
Hao Guo
Owen Rambow
Smaranda Muresan
Heng Ji
83
42
0
16 Oct 2022
Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective
Ping Yang
Junjie Wang
Ruyi Gan
Xinyu Zhu
Lin Zhang
Ziwei Wu
Xinyu Gao
Jiaxing Zhang
Tetsuya Sakai
BDL
71
26
0
16 Oct 2022
Knowledge Prompting in Pre-trained Language Model for Natural Language Understanding
Jiadong Wang
Wenkang Huang
Qiuhui Shi
Hongbin Wang
Minghui Qiu
Xiang Li
Ming Gao
KELM
VLM
90
19
0
16 Oct 2022
Model Criticism for Long-Form Text Generation
Yuntian Deng
Volodymyr Kuleshov
Alexander M. Rush
110
19
0
16 Oct 2022
LAION-5B: An open large-scale dataset for training next generation image-text models
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
...
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
VLM
MLLM
CLIP
231
3,520
0
16 Oct 2022
Temporal Word Meaning Disambiguation using TimeLMs
Mihir Godbole
Parth Dandavate
Aditya Kane
74
2
0
15 Oct 2022
TestAug: A Framework for Augmenting Capability-based NLP Tests
Guanqun Yang
Mirazul Haque
Qiaochu Song
Wei Yang
Xueqing Liu
ELM
61
0
0
14 Oct 2022
Injecting Domain Knowledge from Empirical Interatomic Potentials to Neural Networks for Predicting Material Properties
Zeren Shui
Daniel S. Karls
Mingjian Wen
Ilia Nikiforov
E. Tadmor
George Karypis
72
8
0
14 Oct 2022
PseudoReasoner: Leveraging Pseudo Labels for Commonsense Knowledge Base Population
Tianqing Fang
Quyet V. Do
Hongming Zhang
Yangqiu Song
Ginny Wong
Simon See
LRM
104
11
0
14 Oct 2022
The Debate Over Understanding in AI's Large Language Models
Melanie Mitchell
D. Krakauer
ELM
155
222
0
14 Oct 2022
Sequential Learning Of Neural Networks for Prequential MDL
J. Bornschein
Yazhe Li
Marcus Hutter
AI4TS
55
7
0
14 Oct 2022
Free Fine-tuning: A Plug-and-Play Watermarking Scheme for Deep Neural Networks
Run Wang
Jixing Ren
Boheng Li
Tianyi She
Wenhui Zhang
Liming Fang
Jing Chen
Chao Shen
Lina Wang
WIGM
79
19
0
14 Oct 2022
Robust Preference Learning for Storytelling via Contrastive Reinforcement Learning
Louis Castricato
Alexander Havrilla
Shahbuland Matiana
Michael Pieler
Anbang Ye
Ian Yang
Spencer Frazier
Mark O. Riedl
95
13
0
14 Oct 2022
Extracting Cultural Commonsense Knowledge at Scale
Shrestha Ghosh
Simon Razniewski
A. Varde
Gerhard Weikum
106
66
0
14 Oct 2022
Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey
Sachin Kumar
Vidhisha Balachandran
Lucille Njoo
Antonios Anastasopoulos
Yulia Tsvetkov
ELM
185
91
0
14 Oct 2022
CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
Jinchao Zhang
Shuyang Jiang
Jiangtao Feng
Lin Zheng
Dianbo Sui
3DV
195
9
0
14 Oct 2022
Abstract-to-Executable Trajectory Translation for One-Shot Task Generalization
Stone Tao
Xiaochen Li
Tongzhou Mu
Zhiao Huang
Yuzhe Qin
Hao Su
68
3
0
14 Oct 2022
Enabling Classifiers to Make Judgements Explicitly Aligned with Human Values
Yejin Bang
Tiezheng Yu
Andrea Madotto
Zhaojiang Lin
Mona T. Diab
Pascale Fung
74
13
0
14 Oct 2022
A Survey of Parameters Associated with the Quality of Benchmarks in NLP
Swaroop Mishra
Anjana Arunkumar
Chris Bryan
Chitta Baral
105
1
0
14 Oct 2022
DyLoRA: Parameter Efficient Tuning of Pre-trained Models using Dynamic Search-Free Low-Rank Adaptation
Mojtaba Valipour
Mehdi Rezagholizadeh
I. Kobyzev
A. Ghodsi
149
184
0
14 Oct 2022
Self-Adaptive Named Entity Recognition by Retrieving Unstructured Knowledge
Kosuke Nishida
Naoki Yoshinaga
Kyosuke Nishida
86
2
0
14 Oct 2022
Can Language Representation Models Think in Bets?
Zhi-Bin Tang
Mayank Kejriwal
51
6
0
14 Oct 2022
"John is 50 years old, can his son be 65?" Evaluating NLP Models' Understanding of Feasibility
Himanshu Gupta
Neeraj Varshney
Swaroop Mishra
Kuntal Kumar Pal
Saurabh Arjun Sawant
Kevin Scaria
Siddharth Goyal
Chitta Baral
ELM
100
14
0
14 Oct 2022
Transparency Helps Reveal When Language Models Learn Meaning
Zhaofeng Wu
William Merrill
Hao Peng
Iz Beltagy
Noah A. Smith
59
10
0
14 Oct 2022
Mind the Labels: Describing Relations in Knowledge Graphs With Pretrained Models
Zdeněk Kasner
Ioannis Konstas
Ondřej Dušek
78
6
0
13 Oct 2022