2005.14165
Language Models are Few-Shot Learners
28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
Papers citing
"Language Models are Few-Shot Learners"
50 / 12,183 papers shown
A Survey on Self-supervised Pre-training for Sequential Transfer Learning in Neural Networks
H. H. Mao
BDL
SSL
72
50
0
01 Jul 2020
Computing Conceptual Distances between Breast Cancer Screening Guidelines: An Implementation of a Near-Peer Epistemic Model of Medical Disagreement
Hossein Hematialam
Luciana D. Garbayo
Seethalakshmi Gopalakrishnan
Wlodek Zadrozny
52
1
0
01 Jul 2020
Data Movement Is All You Need: A Case Study on Optimizing Transformers
A. Ivanov
Nikoli Dryden
Tal Ben-Nun
Shigang Li
Torsten Hoefler
133
135
0
30 Jun 2020
Technical Report: Auxiliary Tuning and its Application to Conditional Text Generation
Yoel Zeldes
Dan Padnos
Or Sharir
Barak Peleg
123
19
0
30 Jun 2020
PLATO-2: Towards Building an Open-Domain Chatbot via Curriculum Learning
Siqi Bao
H. He
Fan Wang
Hua Wu
Haifeng Wang
Wenquan Wu
Zhen Guo
Zhibin Liu
Xinchao Xu
92
138
0
30 Jun 2020
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin
HyoukJoong Lee
Yuanzhong Xu
Dehao Chen
Orhan Firat
Yanping Huang
M. Krikun
Noam M. Shazeer
Zhiwen Chen
MoE
149
1,195
0
30 Jun 2020
Natural Backdoor Attack on Text Data
Lichao Sun
SILM
81
41
0
29 Jun 2020
Answering Questions on COVID-19 in Real-Time
Jinhyuk Lee
Sean S. Yi
Minbyul Jeong
Mujeen Sung
Wonjin Yoon
Yonghwa Choi
Miyoung Ko
Jaewoo Kang
80
43
0
29 Jun 2020
Knowledge-Aware Language Model Pretraining
Corby Rosset
Chenyan Xiong
M. Phan
Xia Song
Paul N. Bennett
Saurabh Tiwary
KELM
94
82
0
29 Jun 2020
What they do when in doubt: a study of inductive biases in seq2seq learners
Eugene Kharitonov
Rahma Chaabouni
77
27
0
26 Jun 2020
Evaluation of Text Generation: A Survey
Asli Celikyilmaz
Elizabeth Clark
Jianfeng Gao
ELM
LM&MA
150
389
0
26 Jun 2020
Inference with Artificial Neural Networks on Analog Neuromorphic Hardware
Johannes Weis
Philipp Spilger
Sebastian Billaudelle
Yannik Stradmann
Arne Emmel
...
V. Karasenko
Mitja Kleider
Christian Mauch
Korbinian Schreiber
Johannes Schemmel
63
10
0
23 Jun 2020
Direct Feedback Alignment Scales to Modern Deep Learning Tasks and Architectures
Julien Launay
Iacopo Poli
François Boniface
Florent Krzakala
125
64
0
23 Jun 2020
A Survey on Deep Learning for Localization and Mapping: Towards the Age of Spatial Machine Intelligence
Changhao Chen
Binghai Wang
Chris Xiaoxuan Lu
A. Trigoni
Andrew Markham
101
136
0
22 Jun 2020
The Depth-to-Width Interplay in Self-Attention
Yoav Levine
Noam Wies
Or Sharir
Hofit Bata
Amnon Shashua
126
46
0
22 Jun 2020
MaxVA: Fast Adaptation of Step Sizes by Maximizing Observed Variance of Gradients
Chenfei Zhu
Yu Cheng
Zhe Gan
Furong Huang
Jingjing Liu
Tom Goldstein
ODL
109
2
0
21 Jun 2020
SqueezeBERT: What can computer vision teach NLP about efficient neural networks?
F. Iandola
Albert Eaton Shaw
Ravi Krishna
Kurt Keutzer
VLM
90
127
0
19 Jun 2020
A Qualitative Evaluation of Language Models on Automatic Question-Answering for COVID-19
David Oniani
Yanshan Wang
62
32
0
19 Jun 2020
Zero-Shot Learning with Common Sense Knowledge Graphs
Nihal V. Nayak
Stephen H. Bach
VLM
123
37
0
18 Jun 2020
On the Predictability of Pruning Across Scales
Jonathan S. Rosenfeld
Jonathan Frankle
Michael Carbin
Nir Shavit
69
38
0
18 Jun 2020
What Do Neural Networks Learn When Trained With Random Labels?
Hartmut Maennel
Ibrahim Alabdulmohsin
Ilya O. Tolstikhin
R. Baldock
Olivier Bousquet
Sylvain Gelly
Daniel Keysers
FedML
165
89
0
18 Jun 2020
Neural Anisotropy Directions
Guillermo Ortiz-Jiménez
Apostolos Modas
Seyed-Mohsen Moosavi-Dezfooli
P. Frossard
94
16
0
17 Jun 2020
Dynamic Tensor Rematerialization
Marisa Kirisame
Steven Lyubomirsky
Altan Haan
Jennifer Brennan
Mike He
Jared Roesch
Tianqi Chen
Zachary Tatlock
92
94
0
17 Jun 2020
Memory-Efficient Pipeline-Parallel DNN Training
Deepak Narayanan
Amar Phanishayee
Kaiyu Shi
Xie Chen
Matei A. Zaharia
MoE
92
218
0
16 Jun 2020
Surrogate gradients for analog neuromorphic computing
Benjamin Cramer
Sebastian Billaudelle
Simeon Kanya
Aron Leibfried
Andreas Grübl
...
Korbinian Schreiber
Yannik Stradmann
Johannes Weis
Johannes Schemmel
Friedemann Zenke
79
107
0
12 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai
Justin Johnson
SSL
VLM
167
437
0
11 Jun 2020
Linformer: Self-Attention with Linear Complexity
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
226
1,720
0
08 Jun 2020
The Lipschitz Constant of Self-Attention
Hyunjik Kim
George Papamakarios
A. Mnih
90
146
0
08 Jun 2020
BERT Loses Patience: Fast and Robust Inference with Early Exit
Wangchunshu Zhou
Canwen Xu
Tao Ge
Julian McAuley
Ke Xu
Furu Wei
60
343
0
07 Jun 2020
Detecting Emergent Intersectional Biases: Contextualized Word Embeddings Contain a Distribution of Human-like Biases
W. Guo
Aylin Caliskan
57
245
0
06 Jun 2020
Challenges and Thrills of Legal Arguments
Anurag Pallaprolu
Radha Vaidya
Aditya Swaroop Attawar
LRM
13
0
0
06 Jun 2020
An Overview of Neural Network Compression
James O'Neill
AI4CE
148
99
0
05 Jun 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
173
2,766
0
05 Jun 2020
CoCon: A Self-Supervised Approach for Controlled Text Generation
Alvin Chan
Yew-Soon Ong
B. Pung
Aston Zhang
Jie Fu
79
86
0
05 Jun 2020
MLE-guided parameter search for task loss minimization in neural sequence modeling
Sean Welleck
Kyunghyun Cho
59
10
0
04 Jun 2020
Serving DNNs like Clockwork: Performance Predictability from the Bottom Up
A. Gujarati
Reza Karimi
Safya Alzayat
Wei Hao
Antoine Kaufmann
Ymir Vigfusson
Jonathan Mace
109
285
0
03 Jun 2020
A Survey on Transfer Learning in Natural Language Processing
Zaid Alyafeai
Maged S. Alshaibani
Irfan Ahmad
91
75
0
31 May 2020
Transferring Inductive Biases through Knowledge Distillation
Samira Abnar
Mostafa Dehghani
Willem H. Zuidema
90
60
0
31 May 2020
Predict-then-Decide: A Predictive Approach for Wait or Answer Task in Dialogue Systems
Zehao Lin
Shaobo Cui
Guodun Li
Xiaoming Kang
Feng Ji
Feng-Lin Li
Zhongzhou Zhao
Haiqing Chen
Yin Zhang
60
2
0
27 May 2020
Med-BERT: pre-trained contextualized embeddings on large-scale structured electronic health records for disease prediction
L. Rasmy
Yang Xiang
Z. Xie
Cui Tao
Degui Zhi
AI4MH
LM&MA
101
698
0
22 May 2020
Movement Pruning: Adaptive Sparsity by Fine-Tuning
Victor Sanh
Thomas Wolf
Alexander M. Rush
79
487
0
15 May 2020
Tailoring and Evaluating the Wikipedia for in-Domain Comparable Corpora Extraction
C. España-Bonet
Alberto Barrón-Cedeño
Lluís Màrquez
22
9
0
03 May 2020
Reinforcement Learning with Augmented Data
Michael Laskin
Kimin Lee
Adam Stooke
Lerrel Pinto
Pieter Abbeel
A. Srinivas
OffRL
122
660
0
30 Apr 2020
Explainable Deep Learning: A Field Guide for the Uninitiated
Gabrielle Ras
Ning Xie
Marcel van Gerven
Derek Doran
AAML
XAI
111
379
0
30 Apr 2020
Deep Learning for Time Series Forecasting: Tutorial and Literature Survey
Konstantinos Benidis
Syama Sundar Rangapuram
Valentin Flunkert
Bernie Wang
Danielle C. Maddix
...
David Salinas
Lorenzo Stella
François-Xavier Aubet
Laurent Callot
Tim Januschowski
AI4TS
99
200
0
21 Apr 2020
Experience Grounds Language
Yonatan Bisk
Ari Holtzman
Jesse Thomason
Jacob Andreas
Yoshua Bengio
...
Angeliki Lazaridou
Jonathan May
Aleksandr Nisnevich
Nicolas Pinto
Joseph P. Turian
102
360
0
21 Apr 2020
Improving Readability for Automatic Speech Recognition Transcription
Junwei Liao
Sefik Emre Eskimez
Liyang Lu
Yu Shi
Ming Gong
Linjun Shou
Hong Qu
Michael Zeng
67
56
0
09 Apr 2020
Self-Induced Curriculum Learning in Self-Supervised Neural Machine Translation
Dana Ruiter
Josef van Genabith
C. España-Bonet
SSL
51
3
0
07 Apr 2020
Deep Learning Based Text Classification: A Comprehensive Review
Shervin Minaee
Nal Kalchbrenner
Min Zhang
Narjes Nikzad
M. Asgari-Chenaghlu
Jianfeng Gao
AILaw
VLM
AI4TS
116
1,113
0
06 Apr 2020
Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space
Chunyuan Li
Xiang Gao
Yuan Li
Baolin Peng
Xiujun Li
Yizhe Zhang
Jianfeng Gao
SSL
DRL
86
182
0
05 Apr 2020