ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2005.14165
  4. Cited By
Language Models are Few-Shot Learners
v1v2v3v4 (latest)

Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Ma-teusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL
ArXiv (abs)PDFHTML

Papers citing "Language Models are Few-Shot Learners"

50 / 12,183 papers shown
Title
A Survey on Self-supervised Pre-training for Sequential Transfer
  Learning in Neural Networks
A Survey on Self-supervised Pre-training for Sequential Transfer Learning in Neural Networks
H. H. Mao
BDLSSL
72
50
0
01 Jul 2020
Computing Conceptual Distances between Breast Cancer Screening
  Guidelines: An Implementation of a Near-Peer Epistemic Model of Medical
  Disagreement
Computing Conceptual Distances between Breast Cancer Screening Guidelines: An Implementation of a Near-Peer Epistemic Model of Medical Disagreement
Hossein Hematialam
Luciana D. Garbayo
Seethalakshmi Gopalakrishnan
Wlodek Zadrozny
52
1
0
01 Jul 2020
Data Movement Is All You Need: A Case Study on Optimizing Transformers
Data Movement Is All You Need: A Case Study on Optimizing Transformers
A. Ivanov
Nikoli Dryden
Tal Ben-Nun
Shigang Li
Torsten Hoefler
133
135
0
30 Jun 2020
Technical Report: Auxiliary Tuning and its Application to Conditional
  Text Generation
Technical Report: Auxiliary Tuning and its Application to Conditional Text Generation
Yoel Zeldes
Dan Padnos
Or Sharir
Barak Peleg
123
19
0
30 Jun 2020
PLATO-2: Towards Building an Open-Domain Chatbot via Curriculum Learning
PLATO-2: Towards Building an Open-Domain Chatbot via Curriculum Learning
Siqi Bao
H. He
Fan Wang
Hua Wu
Haifeng Wang
Wenquan Wu
Zhen Guo
Zhibin Liu
Xinchao Xu
92
138
0
30 Jun 2020
GShard: Scaling Giant Models with Conditional Computation and Automatic
  Sharding
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin
HyoukJoong Lee
Yuanzhong Xu
Dehao Chen
Orhan Firat
Yanping Huang
M. Krikun
Noam M. Shazeer
Zhiwen Chen
MoE
149
1,195
0
30 Jun 2020
Natural Backdoor Attack on Text Data
Natural Backdoor Attack on Text Data
Lichao Sun
SILM
81
41
0
29 Jun 2020
Answering Questions on COVID-19 in Real-Time
Answering Questions on COVID-19 in Real-Time
Jinhyuk Lee
Sean S. Yi
Minbyul Jeong
Mujeen Sung
Wonjin Yoon
Yonghwa Choi
Miyoung Ko
Jaewoo Kang
80
43
0
29 Jun 2020
Knowledge-Aware Language Model Pretraining
Knowledge-Aware Language Model Pretraining
Corby Rosset
Chenyan Xiong
M. Phan
Xia Song
Paul N. Bennett
Saurabh Tiwary
KELM
94
82
0
29 Jun 2020
What they do when in doubt: a study of inductive biases in seq2seq
  learners
What they do when in doubt: a study of inductive biases in seq2seq learners
Eugene Kharitonov
Rahma Chaabouni
77
27
0
26 Jun 2020
Evaluation of Text Generation: A Survey
Evaluation of Text Generation: A Survey
Asli Celikyilmaz
Elizabeth Clark
Jianfeng Gao
ELMLM&MA
150
389
0
26 Jun 2020
Inference with Artificial Neural Networks on Analog Neuromorphic
  Hardware
Inference with Artificial Neural Networks on Analog Neuromorphic Hardware
Johannes Weis
Philipp Spilger
Sebastian Billaudelle
Yannik Stradmann
Arne Emmel
...
V. Karasenko
Mitja Kleider
Christian Mauch
Korbinian Schreiber
Johannes Schemmel
63
10
0
23 Jun 2020
Direct Feedback Alignment Scales to Modern Deep Learning Tasks and
  Architectures
Direct Feedback Alignment Scales to Modern Deep Learning Tasks and Architectures
Julien Launay
Iacopo Poli
Franccois Boniface
Florent Krzakala
125
64
0
23 Jun 2020
A Survey on Deep Learning for Localization and Mapping: Towards the Age
  of Spatial Machine Intelligence
A Survey on Deep Learning for Localization and Mapping: Towards the Age of Spatial Machine Intelligence
Changhao Chen
Binghai Wang
Chris Xiaoxuan Lu
A. Trigoni
Andrew Markham
101
136
0
22 Jun 2020
The Depth-to-Width Interplay in Self-Attention
The Depth-to-Width Interplay in Self-Attention
Yoav Levine
Noam Wies
Or Sharir
Hofit Bata
Amnon Shashua
126
46
0
22 Jun 2020
MaxVA: Fast Adaptation of Step Sizes by Maximizing Observed Variance of
  Gradients
MaxVA: Fast Adaptation of Step Sizes by Maximizing Observed Variance of Gradients
Chenfei Zhu
Yu Cheng
Zhe Gan
Furong Huang
Jingjing Liu
Tom Goldstein
ODL
109
2
0
21 Jun 2020
SqueezeBERT: What can computer vision teach NLP about efficient neural
  networks?
SqueezeBERT: What can computer vision teach NLP about efficient neural networks?
F. Iandola
Albert Eaton Shaw
Ravi Krishna
Kurt Keutzer
VLM
90
127
0
19 Jun 2020
A Qualitative Evaluation of Language Models on Automatic
  Question-Answering for COVID-19
A Qualitative Evaluation of Language Models on Automatic Question-Answering for COVID-19
David Oniani
Yanshan Wang
62
32
0
19 Jun 2020
Zero-Shot Learning with Common Sense Knowledge Graphs
Zero-Shot Learning with Common Sense Knowledge Graphs
Nihal V. Nayak
Stephen H. Bach
VLM
123
37
0
18 Jun 2020
On the Predictability of Pruning Across Scales
On the Predictability of Pruning Across Scales
Jonathan S. Rosenfeld
Jonathan Frankle
Michael Carbin
Nir Shavit
69
38
0
18 Jun 2020
What Do Neural Networks Learn When Trained With Random Labels?
What Do Neural Networks Learn When Trained With Random Labels?
Hartmut Maennel
Ibrahim Alabdulmohsin
Ilya O. Tolstikhin
R. Baldock
Olivier Bousquet
Sylvain Gelly
Daniel Keysers
FedML
165
89
0
18 Jun 2020
Neural Anisotropy Directions
Neural Anisotropy Directions
Guillermo Ortiz-Jiménez
Apostolos Modas
Seyed-Mohsen Moosavi-Dezfooli
P. Frossard
94
16
0
17 Jun 2020
Dynamic Tensor Rematerialization
Dynamic Tensor Rematerialization
Marisa Kirisame
Steven Lyubomirsky
Altan Haan
Jennifer Brennan
Mike He
Jared Roesch
Tianqi Chen
Zachary Tatlock
92
94
0
17 Jun 2020
Memory-Efficient Pipeline-Parallel DNN Training
Memory-Efficient Pipeline-Parallel DNN Training
Deepak Narayanan
Amar Phanishayee
Kaiyu Shi
Xie Chen
Matei A. Zaharia
MoE
92
218
0
16 Jun 2020
Surrogate gradients for analog neuromorphic computing
Surrogate gradients for analog neuromorphic computing
Benjamin Cramer
Sebastian Billaudelle
Simeon Kanya
Aron Leibfried
Andreas Grubl
...
Korbinian Schreiber
Yannik Stradmann
Johannes Weis
Johannes Schemmel
Friedemann Zenke
79
107
0
12 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai
Justin Johnson
SSLVLM
167
437
0
11 Jun 2020
Linformer: Self-Attention with Linear Complexity
Linformer: Self-Attention with Linear Complexity
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
226
1,720
0
08 Jun 2020
The Lipschitz Constant of Self-Attention
The Lipschitz Constant of Self-Attention
Hyunjik Kim
George Papamakarios
A. Mnih
90
146
0
08 Jun 2020
BERT Loses Patience: Fast and Robust Inference with Early Exit
BERT Loses Patience: Fast and Robust Inference with Early Exit
Wangchunshu Zhou
Canwen Xu
Tao Ge
Julian McAuley
Ke Xu
Furu Wei
60
343
0
07 Jun 2020
Detecting Emergent Intersectional Biases: Contextualized Word Embeddings
  Contain a Distribution of Human-like Biases
Detecting Emergent Intersectional Biases: Contextualized Word Embeddings Contain a Distribution of Human-like Biases
W. Guo
Aylin Caliskan
57
245
0
06 Jun 2020
Challenges and Thrills of Legal Arguments
Challenges and Thrills of Legal Arguments
Anurag Pallaprolu
Radha Vaidya
Aditya Swaroop Attawar
LRM
13
0
0
06 Jun 2020
An Overview of Neural Network Compression
An Overview of Neural Network Compression
James OÑeill
AI4CE
148
99
0
05 Jun 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
173
2,766
0
05 Jun 2020
CoCon: A Self-Supervised Approach for Controlled Text Generation
CoCon: A Self-Supervised Approach for Controlled Text Generation
Alvin Chan
Yew-Soon Ong
B. Pung
Aston Zhang
Jie Fu
79
86
0
05 Jun 2020
MLE-guided parameter search for task loss minimization in neural
  sequence modeling
MLE-guided parameter search for task loss minimization in neural sequence modeling
Sean Welleck
Kyunghyun Cho
59
10
0
04 Jun 2020
Serving DNNs like Clockwork: Performance Predictability from the Bottom
  Up
Serving DNNs like Clockwork: Performance Predictability from the Bottom Up
A. Gujarati
Reza Karimi
Safya Alzayat
Wei Hao
Antoine Kaufmann
Ymir Vigfusson
Jonathan Mace
109
285
0
03 Jun 2020
A Survey on Transfer Learning in Natural Language Processing
A Survey on Transfer Learning in Natural Language Processing
Zaid Alyafeai
Maged S. Alshaibani
Irfan Ahmad
91
75
0
31 May 2020
Transferring Inductive Biases through Knowledge Distillation
Transferring Inductive Biases through Knowledge Distillation
Samira Abnar
Mostafa Dehghani
Willem H. Zuidema
90
60
0
31 May 2020
Predict-then-Decide: A Predictive Approach for Wait or Answer Task in
  Dialogue Systems
Predict-then-Decide: A Predictive Approach for Wait or Answer Task in Dialogue Systems
Zehao Lin
Shaobo Cui
Guodun Li
Xiaoming Kang
Feng Ji
Feng-Lin Li
Zhongzhou Zhao
Haiqing Chen
Yin Zhang
60
2
0
27 May 2020
Med-BERT: pre-trained contextualized embeddings on large-scale
  structured electronic health records for disease prediction
Med-BERT: pre-trained contextualized embeddings on large-scale structured electronic health records for disease prediction
L. Rasmy
Yang Xiang
Z. Xie
Cui Tao
Degui Zhi
AI4MHLM&MA
101
698
0
22 May 2020
Movement Pruning: Adaptive Sparsity by Fine-Tuning
Movement Pruning: Adaptive Sparsity by Fine-Tuning
Victor Sanh
Thomas Wolf
Alexander M. Rush
79
487
0
15 May 2020
Tailoring and Evaluating the Wikipedia for in-Domain Comparable Corpora
  Extraction
Tailoring and Evaluating the Wikipedia for in-Domain Comparable Corpora Extraction
C. España-Bonet
Alberto Barrón-Cedeño
Lluís Marquez
22
9
0
03 May 2020
Reinforcement Learning with Augmented Data
Reinforcement Learning with Augmented Data
Michael Laskin
Kimin Lee
Adam Stooke
Lerrel Pinto
Pieter Abbeel
A. Srinivas
OffRL
122
660
0
30 Apr 2020
Explainable Deep Learning: A Field Guide for the Uninitiated
Explainable Deep Learning: A Field Guide for the Uninitiated
Gabrielle Ras
Ning Xie
Marcel van Gerven
Derek Doran
AAMLXAI
111
379
0
30 Apr 2020
Deep Learning for Time Series Forecasting: Tutorial and Literature
  Survey
Deep Learning for Time Series Forecasting: Tutorial and Literature Survey
Konstantinos Benidis
Syama Sundar Rangapuram
Valentin Flunkert
Bernie Wang
Danielle C. Maddix
...
David Salinas
Lorenzo Stella
François-Xavier Aubet
Laurent Callot
Tim Januschowski
AI4TS
99
200
0
21 Apr 2020
Experience Grounds Language
Experience Grounds Language
Yonatan Bisk
Ari Holtzman
Jesse Thomason
Jacob Andreas
Yoshua Bengio
...
Angeliki Lazaridou
Jonathan May
Aleksandr Nisnevich
Nicolas Pinto
Joseph P. Turian
102
360
0
21 Apr 2020
Improving Readability for Automatic Speech Recognition Transcription
Improving Readability for Automatic Speech Recognition Transcription
Junwei Liao
Sefik Emre Eskimez
Liyang Lu
Yu Shi
Ming Gong
Linjun Shou
Hong Qu
Michael Zeng
67
56
0
09 Apr 2020
Self-Induced Curriculum Learning in Self-Supervised Neural Machine
  Translation
Self-Induced Curriculum Learning in Self-Supervised Neural Machine Translation
Dana Ruiter
Josef van Genabith
C. España-Bonet
SSL
51
3
0
07 Apr 2020
Deep Learning Based Text Classification: A Comprehensive Review
Deep Learning Based Text Classification: A Comprehensive Review
Shervin Minaee
Nal Kalchbrenner
Min Zhang
Narjes Nikzad
M. Asgari-Chenaghlu
Jianfeng Gao
AILawVLMAI4TS
116
1,113
0
06 Apr 2020
Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space
Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space
Chunyuan Li
Xiang Gao
Yuan Li
Baolin Peng
Xiujun Li
Yizhe Zhang
Jianfeng Gao
SSLDRL
86
182
0
05 Apr 2020
Previous
123...242243244
Next