2005.14165
Cited By
v1
v2
v3
v4 (latest)
Language Models are Few-Shot Learners
28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
Papers citing
"Language Models are Few-Shot Learners"
50 / 12,277 papers shown
Title
Training Large-Scale News Recommenders with Pretrained Language Models in the Loop
Shitao Xiao
Zheng Liu
Yingxia Shao
Tao Di
Xing Xie
VLM
AIFin
186
42
0
18 Feb 2021
Composable Generative Models
Johan Leduc
Nicolas Grislain
SyDa
84
4
0
18 Feb 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
461
1,143
0
17 Feb 2021
Preventing Unauthorized Use of Proprietary Data: Poisoning for Secure Dataset Release
Liam H. Fowl
Ping-yeh Chiang
Micah Goldblum
Jonas Geiping
Arpit Bansal
W. Czaja
Tom Goldstein
80
43
0
16 Feb 2021
Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks
Itay Hubara
Brian Chmiel
Moshe Island
Ron Banner
S. Naor
Daniel Soudry
121
119
0
16 Feb 2021
GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training
Chen Zhu
Renkun Ni
Zheng Xu
Kezhi Kong
Wenjie Huang
Tom Goldstein
ODL
119
56
0
16 Feb 2021
Exploring Transformers in Natural Language Generation: GPT, BERT, and XLNet
M. O. Topal
Anil Bas
Imke van Heerden
LLMAG
AI4CE
73
91
0
16 Feb 2021
TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models
Zhuohan Li
Siyuan Zhuang
Shiyuan Guo
Danyang Zhuo
Hao Zhang
Basel Alomair
Ion Stoica
MoE
95
125
0
16 Feb 2021
Training Larger Networks for Deep Reinforcement Learning
Keita Ota
Devesh K. Jha
Asako Kanezaki
OffRL
97
40
0
16 Feb 2021
Translational Equivariance in Kernelizable Attention
Max Horn
Kumar Shridhar
Elrich Groenewald
Philipp F. M. Baumann
54
7
0
15 Feb 2021
Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm
Laria Reynolds
Kyle McDonell
122
928
0
15 Feb 2021
Understanding Negative Samples in Instance Discriminative Self-supervised Representation Learning
Kento Nozawa
Issei Sato
SSL
132
46
0
13 Feb 2021
Explaining Neural Scaling Laws
Yasaman Bahri
Ethan Dyer
Jared Kaplan
Jaehoon Lee
Utkarsh Sharma
91
270
0
12 Feb 2021
InsNet: An Efficient, Flexible, and Performant Insertion-based Text Generation Model
Sidi Lu
Tao Meng
Nanyun Peng
104
13
0
12 Feb 2021
Contrastive Unsupervised Learning for Speech Emotion Recognition
Mao Li
Bo Yang
Joshua Levy
A. Stolcke
Viktor Rozgic
Spyros Matsoukas
C. Papayiannis
Daniel Bone
Chao Wang
SSL
100
49
0
12 Feb 2021
Proof Artifact Co-training for Theorem Proving with Language Models
Jesse Michael Han
Jason M. Rute
Yuhuai Wu
Edward W. Ayers
Stanislas Polu
AIMat
107
127
0
11 Feb 2021
Cross-Domain Multi-Task Learning for Sequential Sentence Classification in Research Papers
Arthur Brack
Anett Hoppe
Pascal Buschermöhle
Ralph Ewerth
104
18
0
11 Feb 2021
Representation Matters: Offline Pretraining for Sequential Decision Making
Mengjiao Yang
Ofir Nachum
SSL
OffRL
93
119
0
11 Feb 2021
Energy-Harvesting Distributed Machine Learning
Başak Güler
Aylin Yener
FedML
60
15
0
10 Feb 2021
Generating Synthetic Text Data to Evaluate Causal Inference Methods
Zach Wood-Doughty
I. Shpitser
Mark Dredze
SyDa
CML
68
11
0
10 Feb 2021
NAST: Non-Autoregressive Spatial-Temporal Transformer for Time Series Forecasting
Kai Chen
Guang Chen
Dan Xu
Lijun Zhang
Yuyao Huang
Alois C. Knoll
AI4TS
48
22
0
10 Feb 2021
Language Models for Lexical Inference in Context
Martin Schmitt
Hinrich Schütze
78
14
0
10 Feb 2021
Dynamic Neural Networks: A Survey
Yizeng Han
Gao Huang
Shiji Song
Le Yang
Honghui Wang
Yulin Wang
3DH
AI4TS
AI4CE
130
658
0
09 Feb 2021
Improving Scene Graph Classification by Exploiting Knowledge from Texts
Sahand Sharifzadeh
Sina Moayed Baharlou
Martin Schmitt
Hinrich Schütze
Volker Tresp
52
19
0
09 Feb 2021
A Framework for Auditing Data Center Energy Usage and Mitigating Environmental Footprint
Justin Gould
24
1
0
08 Feb 2021
Generating Fake Cyber Threat Intelligence Using Transformer-Based Models
P. Ranade
Aritran Piplai
Sudip Mittal
A. Joshi
Tim Finin
110
71
0
08 Feb 2021
The Singleton Fallacy: Why Current Critiques of Language Models Miss the Point
Magnus Sahlgren
F. Carlsson
55
28
0
08 Feb 2021
Points2Vec: Unsupervised Object-level Feature Learning from Point Clouds
Joel Bachmann
Kenneth Blomqvist
Julian Förster
Roland Siegwart
3DPC
52
3
0
08 Feb 2021
Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch
Aojun Zhou
Yukun Ma
Junnan Zhu
Jianbo Liu
Zhijie Zhang
Kun Yuan
Wenxiu Sun
Hongsheng Li
215
250
0
08 Feb 2021
Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention
Yunyang Xiong
Zhanpeng Zeng
Rudrasis Chakraborty
Mingxing Tan
G. Fung
Yin Li
Vikas Singh
106
526
0
07 Feb 2021
Does He Wink or Does He Nod? A Challenging Benchmark for Evaluating Word Understanding of Language Models
Lutfi Kerem Senel
Hinrich Schütze
45
5
0
06 Feb 2021
Ownership Verification of DNN Architectures via Hardware Cache Side Channels
Xiaoxuan Lou
Shangwei Guo
Jiwei Li
Tianwei Zhang
68
11
0
06 Feb 2021
Symbolic Behaviour in Artificial Intelligence
Adam Santoro
Andrew Kyle Lampinen
Kory W. Mathewson
Timothy Lillicrap
David Raposo
79
34
0
05 Feb 2021
PipeTransformer: Automated Elastic Pipelining for Distributed Training of Transformers
Chaoyang He
Shen Li
Mahdi Soltanolkotabi
Salman Avestimehr
65
29
0
05 Feb 2021
DeepReduce: A Sparse-tensor Communication Framework for Distributed Deep Learning
Kelly Kostopoulou
Hang Xu
Aritra Dutta
Xin Li
A. Ntoulas
Panos Kalnis
41
7
0
05 Feb 2021
Understanding Emails and Drafting Responses -- An Approach Using GPT-3
Jonas Thiergart
Stefan Huber
Thomas Übellacker
38
25
0
05 Feb 2021
Adaptive Semiparametric Language Models
Dani Yogatama
Cyprien de Masson d'Autume
Lingpeng Kong
KELM
RALM
99
100
0
04 Feb 2021
Embodied Intelligence via Learning and Evolution
Agrim Gupta
Silvio Savarese
Surya Ganguli
Li Fei-Fei
AI4CE
102
254
0
03 Feb 2021
When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data
Peter Hase
Joey Tianyi Zhou
XAI
129
89
0
03 Feb 2021
Fairness for Unobserved Characteristics: Insights from Technological Impacts on Queer Communities
Nenad Tomašev
Kevin R. McKee
Jackie Kay
Shakir Mohamed
FaML
85
89
0
03 Feb 2021
Mind the Gap: Assessing Temporal Generalization in Neural Language Models
Angeliki Lazaridou
A. Kuncoro
E. Gribovskaya
Devang Agrawal
Adam Liska
...
Sebastian Ruder
Dani Yogatama
Kris Cao
Susannah Young
Phil Blunsom
VLM
140
219
0
03 Feb 2021
MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers
Krishna Pillutla
Swabha Swayamdipta
Rowan Zellers
John Thickstun
Sean Welleck
Yejin Choi
Zaïd Harchaoui
151
364
0
02 Feb 2021
Scaling Laws for Transfer
Danny Hernandez
Jared Kaplan
T. Henighan
Sam McCandlish
97
251
0
02 Feb 2021
Generative Spoken Language Modeling from Raw Audio
Kushal Lakhotia
Evgeny Kharitonov
Wei-Ning Hsu
Yossi Adi
Adam Polyak
...
Tu Nguyen
Jade Copet
Alexei Baevski
A. Mohamed
Emmanuel Dupoux
AuLLM
290
366
0
01 Feb 2021
Measuring and Improving Consistency in Pretrained Language Models
Yanai Elazar
Nora Kassner
Shauli Ravfogel
Abhilasha Ravichander
Eduard H. Hovy
Hinrich Schütze
Yoav Goldberg
HILM
335
371
0
01 Feb 2021
Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained Language Models
Nora Kassner
Philipp Dufter
Hinrich Schütze
78
141
0
01 Feb 2021
Neural Network architectures to classify emotions in Indian Classical Music
U. Sarkar
Sayan Nag
Medha Basu
Archi Banerjee
S. Sanyal
R. Sengupta
D. Ghosh
16
1
0
01 Feb 2021
Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers
Lisa Anne Hendricks
John F. J. Mellor
R. Schneider
Jean-Baptiste Alayrac
Aida Nematzadeh
148
117
0
31 Jan 2021
Can We Automate Scientific Reviewing?
Weizhe Yuan
Pengfei Liu
Graham Neubig
156
90
0
30 Jan 2021
Challenges in Automated Debiasing for Toxic Language Detection
Xuhui Zhou
Maarten Sap
Swabha Swayamdipta
Noah A. Smith
Yejin Choi
78
142
0
29 Jan 2021