2005.14165
Language Models are Few-Shot Learners
28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
Papers citing
"Language Models are Few-Shot Learners"
50 / 12,425 papers shown
Title
Data-driven Science and Machine Learning Methods in Laser-Plasma Physics
Andreas Döpp
C. Eberle
S. Howard
F. Irshad
Jinpu Lin
M. Streeter
AI4CE
79
72
0
30 Nov 2022
KRLS: Improving End-to-End Response Generation in Task Oriented Dialog with Reinforced Keywords Learning
Xiao Yu
Qingyang Wu
Kun Qian
Zhou Yu
OffRL
70
12
0
30 Nov 2022
Protein Language Models and Structure Prediction: Connection and Progression
Bozhen Hu
Jun Xia
Jiangbin Zheng
Cheng Tan
Yufei Huang
Yongjie Xu
Stan Z. Li
70
41
0
30 Nov 2022
Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off
Shaoyi Huang
Bowen Lei
Dongkuan Xu
Hongwu Peng
Yue Sun
Mimi Xie
Caiwen Ding
97
19
0
30 Nov 2022
CLIP-Nav: Using CLIP for Zero-Shot Vision-and-Language Navigation
Vishnu Sashank Dorbala
Gunnar Sigurdsson
Robinson Piramuthu
Jesse Thomason
Gaurav Sukhatme
LM&Ro
92
59
0
30 Nov 2022
Scalable Pathogen Detection from Next Generation DNA Sequencing with Deep Learning
S. Narayanan
Sathyanarayanan N. Aakur
Priyadharsini Ramamurthy
A. Bagavathi
V. Ramnath
A. Ramachandran
104
0
0
30 Nov 2022
Model Extraction Attack against Self-supervised Speech Models
Tsung-Yuan Hsu
Chen-An Li
Tung-Yu Wu
Hung-yi Lee
48
1
0
29 Nov 2022
MegaBlocks: Efficient Sparse Training with Mixture-of-Experts
Trevor Gale
Deepak Narayanan
C. Young
Matei A. Zaharia
MoE
81
109
0
29 Nov 2022
What learning algorithm is in-context learning? Investigations with linear models
Ekin Akyürek
Dale Schuurmans
Jacob Andreas
Tengyu Ma
Denny Zhou
125
493
0
28 Nov 2022
Is Conditional Generative Modeling all you need for Decision-Making?
Anurag Ajay
Yilun Du
Abhi Gupta
J. Tenenbaum
Tommi Jaakkola
Pulkit Agrawal
DiffM
162
408
0
28 Nov 2022
Action-GPT: Leveraging Large-scale Language Models for Improved and Generalized Action Generation
Sai Shashank Kalakonda
Shubh Maheshwari
Ravi Kiran Sarvadevabhatla
114
28
0
28 Nov 2022
On the Effectiveness of Parameter-Efficient Fine-Tuning
Z. Fu
Haoran Yang
Anthony Man-Cho So
Wai Lam
Lidong Bing
Nigel Collier
78
162
0
28 Nov 2022
A Survey on Conversational Search and Applications in Biomedicine
Naga Sai Krishna Adatrao
Gowtham Reddy Gadireddy
Jiho Noh
MedIm
64
4
0
28 Nov 2022
Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes
Aviral Kumar
Rishabh Agarwal
Xinyang Geng
George Tucker
Sergey Levine
OffRL
131
51
0
28 Nov 2022
MGFN: Magnitude-Contrastive Glance-and-Focus Network for Weakly-Supervised Video Anomaly Detection
Yukang Chen
Zhengzhe Liu
Baoheng Zhang
W. Fok
Xiaojuan Qi
Yik-Chung Wu
119
129
0
28 Nov 2022
Continuous diffusion for categorical data
Sander Dieleman
Laurent Sartran
Arman Roshannai
Nikolay Savinov
Yaroslav Ganin
...
Conor Durkan
Curtis Hawthorne
Rémi Leblond
Will Grathwohl
J. Adler
DiffM
121
106
0
28 Nov 2022
DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models
Zhengfu He
Tianxiang Sun
Kuan-Chieh Wang
Xuanjing Huang
Xipeng Qiu
DiffM
VLM
101
131
0
28 Nov 2022
Fine-tuning language models to find agreement among humans with diverse preferences
Michiel A. Bakker
Martin Chadwick
Hannah R. Sheahan
Michael Henry Tessler
Lucy Campbell-Gillingham
...
Nat McAleese
Amelia Glaese
John Aslanides
M. Botvinick
Christopher Summerfield
ALM
110
237
0
28 Nov 2022
A Comprehensive Survey on Enterprise Financial Risk Analysis from Big Data Perspective
Yu Zhao
Huaming Du
Qing Li
Fuzhen Zhuang
Ji Liu
Gang Kou
157
1
0
28 Nov 2022
Understanding BLOOM: An empirical study on diverse NLP tasks
Parag Dakle
Sai Krishna Rallabandi
Preethi Raghavan
AI4CE
89
4
0
27 Nov 2022
Latent SHAP: Toward Practical Human-Interpretable Explanations
Ron Bitton
Alon Malach
Amiel Meiseles
Satoru Momiyama
Toshinori Araki
Jun Furukawa
Yuval Elovici
A. Shabtai
FAtt
35
4
0
27 Nov 2022
Traditional Classification Neural Networks are Good Generators: They are Competitive with DDPMs and GANs
Guangrun Wang
Philip Torr
83
9
0
27 Nov 2022
How Crucial is Transformer in Decision Transformer?
Max Siebenborn
Boris Belousov
Junning Huang
Jan Peters
54
15
0
26 Nov 2022
Asymmetric Cross-Scale Alignment for Text-Based Person Search
Zhong Ji
Junhua Hu
Deyin Liu
Yuan Wu
Ye Zhao
106
46
0
26 Nov 2022
Deep Learning Training Procedure Augmentations
Cristian Simionescu
104
1
0
25 Nov 2022
A System for Morphology-Task Generalization via Unified Representation and Behavior Distillation
Hiroki Furuta
Yusuke Iwasawa
Yutaka Matsuo
S. Gu
79
17
0
25 Nov 2022
Solving math word problems with process- and outcome-based feedback
J. Uesato
Nate Kushman
Ramana Kumar
Francis Song
Noah Y. Siegel
L. Wang
Antonia Creswell
G. Irving
I. Higgins
FaML
ReLM
AIMat
LRM
133
362
0
25 Nov 2022
GPT-3-driven pedagogical agents for training children's curious question-asking skills
Rania Abdelghani
Yen-Hsiang Wang
Xingdi Yuan
Tong Wang
Pauline Lucas
Hélène Sauzéon
Pierre-Yves Oudeyer
118
108
0
25 Nov 2022
Semantic Table Detection with LayoutLMv3
Ivan Silajev
Niels Victor
Phillip Mortimer
48
1
0
25 Nov 2022
Testing the effectiveness of saliency-based explainability in NLP using randomized survey-based experiments
Adel Rahimi
Shaurya Jain
FAtt
99
0
0
25 Nov 2022
Complementary Explanations for Effective In-Context Learning
Xi Ye
Srini Iyer
Asli Celikyilmaz
Ves Stoyanov
Greg Durrett
Ramakanth Pasunuru
ReLM
LRM
112
96
0
25 Nov 2022
TPA-Net: Generate A Dataset for Text to Physics-based Animation
Yuxing Qiu
Feng Gao
Minchen Li
Govind Thattai
Yin Yang
Chenfanfu Jiang
PINN
DiffM
VGen
58
0
0
25 Nov 2022
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism
Xupeng Miao
Yujie Wang
Youhe Jiang
Chunan Shi
Xiaonan Nie
Hailin Zhang
Tengjiao Wang
GNN
MoE
110
64
0
25 Nov 2022
Signed Binary Weight Networks
Sachit Kuhar
Alexey Tumanov
Judy Hoffman
MQ
86
1
0
25 Nov 2022
Multi-label Few-shot ICD Coding as Autoregressive Generation with Prompt
Zhichao Yang
Sunjae Kwon
Zonghai Yao
Hongfeng Yu
74
18
0
24 Nov 2022
Undesirable Biases in NLP: Addressing Challenges of Measurement
Oskar van der Wal
Dominik Bachmann
Alina Leidinger
L. Maanen
Willem H. Zuidema
K. Schulz
86
7
0
24 Nov 2022
Prototypical Fine-tuning: Towards Robust Performance Under Varying Data Sizes
Yiqiao Jin
Xiting Wang
Y. Hao
Yizhou Sun
Xing Xie
94
11
0
24 Nov 2022
TSGP: Two-Stage Generative Prompting for Unsupervised Commonsense Question Answering
Yueqing Sun
Yu Zhang
Le Qi
Qi Shi
ReLM
RALM
LRM
56
6
0
24 Nov 2022
Multi-Environment Pretraining Enables Transfer to Action Limited Datasets
David Venuto
Sherry Yang
Pieter Abbeel
Doina Precup
Igor Mordatch
Ofir Nachum
OffRL
56
5
0
23 Nov 2022
Automatic Generation of Socratic Subquestions for Teaching Math Word Problems
Kumar Shridhar
Jakub Macina
Mennatallah El-Assady
Tanmay Sinha
Manu Kapur
Mrinmaya Sachan
AIMat
98
49
0
23 Nov 2022
Learning Regularized Positional Encoding for Molecular Prediction
Xiang Gao
Weihao Gao
Wen Xiao
Zhirui Wang
Chong Wang
Liang Xiang
AI4CE
80
2
0
23 Nov 2022
Masked Autoencoding for Scalable and Generalizable Decision Making
Fangchen Liu
Hao Liu
Aditya Grover
Pieter Abbeel
OffRL
87
49
0
23 Nov 2022
AutoReply: Detecting Nonsense in Dialogue Introspectively with Discriminative Replies
Weiyan Shi
Emily Dinan
Adithya Renduchintala
Daniel Fried
Athul Paul Jacob
Zhou Yu
M. Lewis
AAML
111
2
0
22 Nov 2022
Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks
Wenhu Chen
Xueguang Ma
Xinyi Wang
William W. Cohen
ReLM
ReCod
LRM
244
829
0
22 Nov 2022
Retrieval-Augmented Multimodal Language Modeling
Michihiro Yasunaga
Armen Aghajanyan
Weijia Shi
Rich James
J. Leskovec
Percy Liang
M. Lewis
Luke Zettlemoyer
Wen-tau Yih
RALM
104
108
0
22 Nov 2022
HyperTuning: Toward Adapting Large Language Models without Back-propagation
Jason Phang
Yi Mao
Pengcheng He
Weizhu Chen
96
34
0
22 Nov 2022
Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions
S. Bhattamishra
Arkil Patel
Varun Kanade
Phil Blunsom
116
49
0
22 Nov 2022
Coreference Resolution through a seq2seq Transition-Based System
Bernd Bohnet
Chris Alberti
Michael Collins
80
40
0
22 Nov 2022
Convolutional Neural Generative Coding: Scaling Predictive Coding to Natural Images
Alexander Ororbia
A. Mali
BDL
70
11
0
22 Nov 2022
Multitask Vision-Language Prompt Tuning
Sheng Shen
Shijia Yang
Tianjun Zhang
Bohan Zhai
Joseph E. Gonzalez
Kurt Keutzer
Trevor Darrell
VLM
VPVLM
115
53
0
21 Nov 2022