2005.14165
Cited By
Language Models are Few-Shot Learners
28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
Papers citing
"Language Models are Few-Shot Learners"
50 / 11,513 papers shown
Title
Exploring the limits of Concurrency in ML Training on Google TPUs
Sameer Kumar
James Bradbury
C. Young
Yu Emma Wang
Anselm Levskaya
...
Tao Wang
Tayo Oguntebi
Yazhou Zu
Yuanzhong Xu
Andy Swing
BDL
AIMat
MoE
LRM
25
27
0
07 Nov 2020
Machine Generation and Detection of Arabic Manipulated and Fake News
El Moatez Billah Nagoudi
AbdelRahim Elmadany
Muhammad Abdul-Mageed
Tariq Alhindi
H. Cavusoglu
DeLMO
24
50
0
05 Nov 2020
Detecting Hallucinated Content in Conditional Neural Sequence Generation
Chunting Zhou
Graham Neubig
Jiatao Gu
Mona T. Diab
P. Guzmán
Luke Zettlemoyer
Marjan Ghazvininejad
HILM
39
195
0
05 Nov 2020
Rearrangement: A Challenge for Embodied AI
Dhruv Batra
Angel X. Chang
Sonia Chernova
Andrew J. Davison
Jia Deng
...
Jitendra Malik
Igor Mordatch
Roozbeh Mottaghi
Manolis Savva
Hao Su
LM&Ro
38
217
0
03 Nov 2020
Emergent Communication Pretraining for Few-Shot Machine Translation
Yaoyiran Li
Edoardo Ponti
Ivan Vulić
Anna Korhonen
25
19
0
02 Nov 2020
Melody-Conditioned Lyrics Generation with SeqGANs
Yihao Chen
Alexander Lerch
GAN
MGen
32
29
0
28 Oct 2020
Scaling Laws for Autoregressive Generative Modeling
T. Henighan
Jared Kaplan
Mor Katz
Mark Chen
Christopher Hesse
...
Nick Ryder
Daniel M. Ziegler
John Schulman
Dario Amodei
Sam McCandlish
53
408
0
28 Oct 2020
A Statistical Framework for Low-bitwidth Training of Deep Neural Networks
Jianfei Chen
Yujie Gai
Z. Yao
Michael W. Mahoney
Joseph E. Gonzalez
MQ
20
58
0
27 Oct 2020
Dutch Humor Detection by Generating Negative Examples
Thomas Winters
Pieter Delobelle
19
10
0
26 Oct 2020
Automatically Identifying Words That Can Serve as Labels for Few-Shot Text Classification
Timo Schick
Helmut Schmid
Hinrich Schütze
VLM
19
206
0
26 Oct 2020
Pre-trained Summarization Distillation
Sam Shleifer
Alexander M. Rush
26
98
0
24 Oct 2020
Text Editing by Command
Felix Faltings
Michel Galley
Gerold Hintz
Chris Brockett
Chris Quirk
Jianfeng Gao
Bill Dolan
KELM
147
37
0
24 Oct 2020
Rethinking embedding coupling in pre-trained language models
Hyung Won Chung
Thibault Févry
Henry Tsai
Melvin Johnson
Sebastian Ruder
95
142
0
24 Oct 2020
Text Style Transfer: A Review and Experimental Evaluation
Zhiqiang Hu
Roy Ka-wei Lee
Charu C. Aggarwal
Aston Zhang
AI4TS
42
26
0
24 Oct 2020
An Evaluation Protocol for Generative Conversational Systems
Seolhwa Lee
Heuiseok Lim
João Sedoc
ELM
35
10
0
24 Oct 2020
Learning to Recognize Dialect Features
Dorottya Demszky
D. Sharma
J. Clark
Vinodkumar Prabhakaran
Jacob Eisenstein
123
38
0
23 Oct 2020
Long Document Ranking with Query-Directed Sparse Transformer
Jyun-Yu Jiang
Chenyan Xiong
Chia-Jung Lee
Wei Wang
33
25
0
23 Oct 2020
On the Transformer Growth for Progressive BERT Training
Xiaotao Gu
Liyuan Liu
Hongkun Yu
Jing Li
Chong Chen
Jiawei Han
VLM
69
51
0
23 Oct 2020
The Turking Test: Can Language Models Understand Instructions?
Avia Efrat
Omer Levy
ELM
LRM
34
96
0
22 Oct 2020
Language Models are Open Knowledge Graphs
Chenguang Wang
Xiao Liu
D. Song
SSL
KELM
26
135
0
22 Oct 2020
Limitations of Autoregressive Models and Their Alternatives
Chu-cheng Lin
Aaron Jaech
Xin Li
Matthew R. Gormley
Jason Eisner
29
58
0
22 Oct 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
41
39,428
0
22 Oct 2020
AdapterDrop: On the Efficiency of Adapters in Transformers
Andreas Rucklé
Gregor Geigle
Max Glockner
Tilman Beck
Jonas Pfeiffer
Nils Reimers
Iryna Gurevych
57
255
0
22 Oct 2020
Is Retriever Merely an Approximator of Reader?
Sohee Yang
Minjoon Seo
RALM
24
39
0
21 Oct 2020
Bootleg: Chasing the Tail with Self-Supervised Named Entity Disambiguation
Laurel J. Orr
Megan Leszczynski
Simran Arora
Sen Wu
Neel Guha
Xiao Ling
Christopher Ré
143
48
0
20 Oct 2020
Local Knowledge Powered Conversational Agents
Sashank Santhanam
Ming-Yu Liu
Raul Puri
M. Shoeybi
M. Patwary
Bryan Catanzaro
29
4
0
20 Oct 2020
Neural Language Modeling for Contextualized Temporal Graph Generation
Aman Madaan
Yiming Yang
45
20
0
20 Oct 2020
Optimism in the Face of Adversity: Understanding and Improving Deep Learning through Adversarial Robustness
Guillermo Ortiz-Jiménez
Apostolos Modas
Seyed-Mohsen Moosavi-Dezfooli
P. Frossard
AAML
31
48
0
19 Oct 2020
Consistency and Coherency Enhanced Story Generation
Wei Wang
Piji Li
Haitao Zheng
30
11
0
17 Oct 2020
For self-supervised learning, Rationality implies generalization, provably
Yamini Bansal
Gal Kaplun
Boaz Barak
OOD
SSL
60
22
0
16 Oct 2020
The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers
Preetum Nakkiran
Behnam Neyshabur
Hanie Sedghi
OffRL
29
11
0
16 Oct 2020
Masked Contrastive Representation Learning for Reinforcement Learning
Jinhua Zhu
Yingce Xia
Lijun Wu
Jiajun Deng
Wen-gang Zhou
Tao Qin
Houqiang Li
SSL
OffRL
34
55
0
15 Oct 2020
Neural Databases
James Thorne
Majid Yazdani
Marzieh Saeidi
Fabrizio Silvestri
Sebastian Riedel
A. Halevy
NAI
34
9
0
14 Oct 2020
Pretrained Transformers for Text Ranking: BERT and Beyond
Jimmy J. Lin
Rodrigo Nogueira
Andrew Yates
VLM
244
612
0
13 Oct 2020
MixCo: Mix-up Contrastive Learning for Visual Representation
Sungnyun Kim
Gihun Lee
Sangmin Bae
Seyoung Yun
SSL
112
80
0
13 Oct 2020
Improving Text Generation with Student-Forcing Optimal Transport
Guoyin Wang
Chunyuan Li
Jianqiao Li
Hao Fu
Yuh-Chen Lin
...
Ruiyi Zhang
Wenlin Wang
Dinghan Shen
Qian Yang
Lawrence Carin
OT
30
17
0
12 Oct 2020
Neural, Symbolic and Neural-Symbolic Reasoning on Knowledge Graphs
Jing Zhang
Bo Chen
Lingxi Zhang
Xirui Ke
Haipeng Ding
NAI
40
3
0
12 Oct 2020
SMYRF: Efficient Attention using Asymmetric Clustering
Giannis Daras
Nikita Kitaev
Augustus Odena
A. Dimakis
31
44
0
11 Oct 2020
A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks
Nikunj Saunshi
Sadhika Malladi
Sanjeev Arora
31
87
0
07 Oct 2020
A ground-truth dataset and classification model for detecting bots in GitHub issue and PR comments
M. Golzadeh
Alexandre Decan
Damien Legay
T. Mens
31
73
0
07 Oct 2020
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components
Junwen Bai
Weiran Wang
Yingbo Zhou
Caiming Xiong
SSL
AI4TS
27
12
0
07 Oct 2020
InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective
Wei Ping
Shuohang Wang
Yu Cheng
Zhe Gan
R. Jia
Bo-wen Li
Jingjing Liu
AAML
46
113
0
05 Oct 2020
Local Label Point Correction for Edge Detection of Overlapping Cervical Cells
Jiawei Liu
Huijie Fan
Qiang Wang
Wentao Li
Yandong Tang
Danbo Wang
Mingyi Zhou
Li Chen
13
9
0
05 Oct 2020
PMI-Masking: Principled masking of correlated spans
Yoav Levine
Barak Lenz
Opher Lieber
Omri Abend
Kevin Leyton-Brown
Moshe Tennenholtz
Y. Shoham
22
72
0
05 Oct 2020
Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models
Thuy-Trang Vu
Dinh Q. Phung
Gholamreza Haffari
19
24
0
05 Oct 2020
Data-Efficient Pretraining via Contrastive Self-Supervision
Nils Rethmeier
Isabelle Augenstein
28
20
0
02 Oct 2020
Where Does Trust Break Down? A Quantitative Trust Analysis of Deep Neural Networks via Trust Matrix and Conditional Trust Densities
Andrew Hryniowski
Xiao Yu Wang
A. Wong
25
10
0
30 Sep 2020
Understanding Human Intelligence through Human Limitations
Thomas Griffiths
28
64
0
29 Sep 2020
Utility is in the Eye of the User: A Critique of NLP Leaderboards
Kawin Ethayarajh
Dan Jurafsky
ELM
24
51
0
29 Sep 2020
From Twitter to Traffic Predictor: Next-Day Morning Traffic Prediction Using Social Media Data
Weiran Yao
Sean Qian
19
47
0
29 Sep 2020