Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2005.14165
Cited By
Language Models are Few-Shot Learners
28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Ma-teusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Language Models are Few-Shot Learners"
32 / 11,082 papers shown
Title
Predict-then-Decide: A Predictive Approach for Wait or Answer Task in Dialogue Systems
Zehao Lin
Shaobo Cui
Guodun Li
Xiaoming Kang
Feng Ji
Feng-Lin Li
Zhongzhou Zhao
Haiqing Chen
Yin Zhang
34
1
0
27 May 2020
Med-BERT: pre-trained contextualized embeddings on large-scale structured electronic health records for disease prediction
L. Rasmy
Yang Xiang
Z. Xie
Cui Tao
Degui Zhi
AI4MH
LM&MA
24
656
0
22 May 2020
Movement Pruning: Adaptive Sparsity by Fine-Tuning
Victor Sanh
Thomas Wolf
Alexander M. Rush
32
468
0
15 May 2020
How Can We Accelerate Progress Towards Human-like Linguistic Generalization?
Tal Linzen
220
189
0
03 May 2020
Explainable Deep Learning: A Field Guide for the Uninitiated
Gabrielle Ras
Ning Xie
Marcel van Gerven
Derek Doran
AAML
XAI
41
371
0
30 Apr 2020
Deep Learning for Time Series Forecasting: Tutorial and Literature Survey
Konstantinos Benidis
Syama Sundar Rangapuram
Valentin Flunkert
Bernie Wang
Danielle C. Maddix
...
David Salinas
Lorenzo Stella
François-Xavier Aubet
Laurent Callot
Tim Januschowski
AI4TS
25
176
0
21 Apr 2020
Experience Grounds Language
Yonatan Bisk
Ari Holtzman
Jesse Thomason
Jacob Andreas
Yoshua Bengio
...
Angeliki Lazaridou
Jonathan May
Aleksandr Nisnevich
Nicolas Pinto
Joseph P. Turian
21
351
0
21 Apr 2020
Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space
Chunyuan Li
Xiang Gao
Yuan Li
Baolin Peng
Xiujun Li
Yizhe Zhang
Jianfeng Gao
SSL
DRL
32
181
0
05 Apr 2020
A Low-cost Fault Corrector for Deep Neural Networks through Range Restriction
Zitao Chen
Guanpeng Li
Karthik Pattabiraman
AAML
AI4CE
28
17
0
30 Mar 2020
Machine learning as a model for cultural learning: Teaching an algorithm what it means to be fat
Alina Arseniev-Koehler
J. Foster
43
46
0
24 Mar 2020
Pre-trained Models for Natural Language Processing: A Survey
Xipeng Qiu
Tianxiang Sun
Yige Xu
Yunfan Shao
Ning Dai
Xuanjing Huang
LM&MA
VLM
243
1,452
0
18 Mar 2020
Iterative Averaging in the Quest for Best Test Error
Diego Granziol
Xingchen Wan
Samuel Albanie
Stephen J. Roberts
10
3
0
02 Mar 2020
Loss landscapes and optimization in over-parameterized non-linear systems and neural networks
Chaoyue Liu
Libin Zhu
M. Belkin
ODL
17
247
0
29 Feb 2020
Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts
Max Ryabinin
Anton I. Gusev
FedML
27
48
0
10 Feb 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
264
4,489
0
23 Jan 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick
Hinrich Schütze
258
1,589
0
21 Jan 2020
Language Models Are An Effective Patient Representation Learning Technique For Electronic Health Record Data
E. Steinberg
Kenneth Jung
Jason Alan Fries
Conor K. Corbin
Stephen R. Pfohl
N. Shah
26
103
0
06 Jan 2020
Fast and energy-efficient neuromorphic deep learning with first-spike times
Julian Goltz
Laura Kriener
A. Baumbach
Sebastian Billaudelle
O. Breitwieser
...
Á. F. Kungl
Walter Senn
Johannes Schemmel
K. Meier
Mihai A. Petrovici
35
126
0
24 Dec 2019
Blockwise Self-Attention for Long Document Understanding
J. Qiu
Hao Ma
Omer Levy
Scott Yih
Sinong Wang
Jie Tang
11
251
0
07 Nov 2019
Discovering the Compositional Structure of Vector Representations with Role Learning Networks
Paul Soulos
R. Thomas McCoy
Tal Linzen
P. Smolensky
CoGe
29
43
0
21 Oct 2019
Demon: Improved Neural Network Training with Momentum Decay
John Chen
Cameron R. Wolfe
Zhaoqi Li
Anastasios Kyrillidis
ODL
24
15
0
11 Oct 2019
On the adequacy of untuned warmup for adaptive optimization
Jerry Ma
Denis Yarats
59
70
0
09 Oct 2019
Soft-Label Dataset Distillation and Text Dataset Distillation
Ilia Sucholutsky
Matthias Schonlau
DD
33
131
0
06 Oct 2019
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
301
1,610
0
18 Sep 2019
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,826
0
17 Sep 2019
The Woman Worked as a Babysitter: On Biases in Language Generation
Emily Sheng
Kai-Wei Chang
Premkumar Natarajan
Nanyun Peng
223
618
0
03 Sep 2019
Trends in Integration of Vision and Language Research: A Survey of Tasks, Datasets, and Methods
Aditya Mogadala
M. Kalimuthu
Dietrich Klakow
VLM
20
132
0
22 Jul 2019
An Information Theoretic Interpretation to Deep Neural Networks
Shao-Lun Huang
Xiangxiang Xu
Lizhong Zheng
G. Wornell
FAtt
22
41
0
16 May 2019
Investigating Antigram Behaviour using Distributional Semantics
Saptarshi Sengupta
11
0
0
15 Jan 2019
O2A: One-shot Observational learning with Action vectors
Leo Pauly
Wisdom C. Agboh
David C. Hogg
R. Fuentes
54
9
0
17 Oct 2018
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
Chelsea Finn
Pieter Abbeel
Sergey Levine
OOD
383
11,700
0
09 Mar 2017
Quantifying the probable approximation error of probabilistic inference programs
Marco F. Cusumano-Towner
Vikash K. Mansinghka
33
7
0
31 May 2016
Previous
1
2
3
...
220
221
222