Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 12,277 papers shown
Training Large-Scale News Recommenders with Pretrained Language Models in the Loop
Shitao Xiao
Zheng Liu
Yingxia Shao
Tao Di
Xing Xie
VLM, AIFin
186
42
0
18 Feb 2021
Composable Generative Models
Johan Leduc
Nicolas Grislain
SyDa
84
4
0
18 Feb 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
461
1,143
0
17 Feb 2021
Preventing Unauthorized Use of Proprietary Data: Poisoning for Secure Dataset Release
Liam H. Fowl
Ping-yeh Chiang
Micah Goldblum
Jonas Geiping
Arpit Bansal
W. Czaja
Tom Goldstein
80
43
0
16 Feb 2021
Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks
Itay Hubara
Brian Chmiel
Moshe Island
Ron Banner
S. Naor
Daniel Soudry
121
119
0
16 Feb 2021
GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training
Chen Zhu
Renkun Ni
Zheng Xu
Kezhi Kong
Wenjie Huang
Tom Goldstein
ODL
119
56
0
16 Feb 2021
Exploring Transformers in Natural Language Generation: GPT, BERT, and XLNet
M. O. Topal
Anil Bas
Imke van Heerden
LLMAG, AI4CE
73
91
0
16 Feb 2021
TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models
Zhuohan Li
Siyuan Zhuang
Shiyuan Guo
Danyang Zhuo
Hao Zhang
Basel Alomair
Ion Stoica
MoE
95
125
0
16 Feb 2021
Training Larger Networks for Deep Reinforcement Learning
Keita Ota
Devesh K. Jha
Asako Kanezaki
OffRL
97
40
0
16 Feb 2021
Translational Equivariance in Kernelizable Attention
Max Horn
Kumar Shridhar
Elrich Groenewald
Philipp F. M. Baumann
54
7
0
15 Feb 2021
Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm
Laria Reynolds
Kyle McDonell
122
928
0
15 Feb 2021
Understanding Negative Samples in Instance Discriminative Self-supervised Representation Learning
Kento Nozawa
Issei Sato
SSL
132
46
0
13 Feb 2021
Explaining Neural Scaling Laws
Yasaman Bahri
Ethan Dyer
Jared Kaplan
Jaehoon Lee
Utkarsh Sharma
91
270
0
12 Feb 2021
InsNet: An Efficient, Flexible, and Performant Insertion-based Text Generation Model
Sidi Lu
Tao Meng
Nanyun Peng
104
13
0
12 Feb 2021
Contrastive Unsupervised Learning for Speech Emotion Recognition
Mao Li
Bo Yang
Joshua Levy
A. Stolcke
Viktor Rozgic
Spyros Matsoukas
C. Papayiannis
Daniel Bone
Chao Wang
SSL
100
49
0
12 Feb 2021
Proof Artifact Co-training for Theorem Proving with Language Models
Jesse Michael Han
Jason M. Rute
Yuhuai Wu
Edward W. Ayers
Stanislas Polu
AIMat
107
127
0
11 Feb 2021
Cross-Domain Multi-Task Learning for Sequential Sentence Classification in Research Papers
Arthur Brack
Anett Hoppe
Pascal Buschermöhle
Ralph Ewerth
104
18
0
11 Feb 2021
Representation Matters: Offline Pretraining for Sequential Decision Making
Mengjiao Yang
Ofir Nachum
SSL, OffRL
93
119
0
11 Feb 2021
Energy-Harvesting Distributed Machine Learning
Başak Güler
Aylin Yener
FedML
60
15
0
10 Feb 2021
Generating Synthetic Text Data to Evaluate Causal Inference Methods
Zach Wood-Doughty
I. Shpitser
Mark Dredze
SyDa, CML
68
11
0
10 Feb 2021
NAST: Non-Autoregressive Spatial-Temporal Transformer for Time Series Forecasting
Kai Chen
Guang Chen
Dan Xu
Lijun Zhang
Yuyao Huang
Alois C. Knoll
AI4TS
48
22
0
10 Feb 2021
Language Models for Lexical Inference in Context
Martin Schmitt
Hinrich Schütze
78
14
0
10 Feb 2021
Dynamic Neural Networks: A Survey
Yizeng Han
Gao Huang
Shiji Song
Le Yang
Honghui Wang
Yulin Wang
3DH, AI4TS, AI4CE
130
658
0
09 Feb 2021
Improving Scene Graph Classification by Exploiting Knowledge from Texts
Sahand Sharifzadeh
Sina Moayed Baharlou
Martin Schmitt
Hinrich Schütze
Volker Tresp
52
19
0
09 Feb 2021
A Framework for Auditing Data Center Energy Usage and Mitigating Environmental Footprint
Justin Gould
24
1
0
08 Feb 2021
Generating Fake Cyber Threat Intelligence Using Transformer-Based Models
P. Ranade
Aritran Piplai
Sudip Mittal
A. Joshi
Tim Finin
110
71
0
08 Feb 2021
The Singleton Fallacy: Why Current Critiques of Language Models Miss the Point
Magnus Sahlgren
F. Carlsson
55
28
0
08 Feb 2021
Points2Vec: Unsupervised Object-level Feature Learning from Point Clouds
Joel Bachmann
Kenneth Blomqvist
Julian Förster
Roland Siegwart
3DPC
52
3
0
08 Feb 2021
Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch
Aojun Zhou
Yukun Ma
Junnan Zhu
Jianbo Liu
Zhijie Zhang
Kun Yuan
Wenxiu Sun
Hongsheng Li
215
250
0
08 Feb 2021
Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention
Yunyang Xiong
Zhanpeng Zeng
Rudrasis Chakraborty
Mingxing Tan
G. Fung
Yin Li
Vikas Singh
106
526
0
07 Feb 2021
Does He Wink or Does He Nod? A Challenging Benchmark for Evaluating Word Understanding of Language Models
Lutfi Kerem Senel
Hinrich Schütze
45
5
0
06 Feb 2021
Ownership Verification of DNN Architectures via Hardware Cache Side Channels
Xiaoxuan Lou
Shangwei Guo
Jiwei Li
Tianwei Zhang
68
11
0
06 Feb 2021
Symbolic Behaviour in Artificial Intelligence
Adam Santoro
Andrew Kyle Lampinen
Kory W. Mathewson
Timothy Lillicrap
David Raposo
79
34
0
05 Feb 2021
PipeTransformer: Automated Elastic Pipelining for Distributed Training of Transformers
Chaoyang He
Shen Li
Mahdi Soltanolkotabi
Salman Avestimehr
65
29
0
05 Feb 2021
DeepReduce: A Sparse-tensor Communication Framework for Distributed Deep Learning
Kelly Kostopoulou
Hang Xu
Aritra Dutta
Xin Li
A. Ntoulas
Panos Kalnis
41
7
0
05 Feb 2021
Understanding Emails and Drafting Responses -- An Approach Using GPT-3
Jonas Thiergart
Stefan Huber
Thomas Übellacker
38
25
0
05 Feb 2021
Adaptive Semiparametric Language Models
Dani Yogatama
Cyprien de Masson d'Autume
Lingpeng Kong
KELM, RALM
99
100
0
04 Feb 2021
Embodied Intelligence via Learning and Evolution
Agrim Gupta
Silvio Savarese
Surya Ganguli
Li Fei-Fei
AI4CE
102
254
0
03 Feb 2021
When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data
Peter Hase
Mohit Bansal
XAI
129
89
0
03 Feb 2021
Fairness for Unobserved Characteristics: Insights from Technological Impacts on Queer Communities
Nenad Tomašev
Kevin R. McKee
Jackie Kay
Shakir Mohamed
FaML
85
89
0
03 Feb 2021
Mind the Gap: Assessing Temporal Generalization in Neural Language Models
Angeliki Lazaridou
A. Kuncoro
E. Gribovskaya
Devang Agrawal
Adam Liska
...
Sebastian Ruder
Dani Yogatama
Kris Cao
Susannah Young
Phil Blunsom
VLM
140
219
0
03 Feb 2021
MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers
Krishna Pillutla
Swabha Swayamdipta
Rowan Zellers
John Thickstun
Sean Welleck
Yejin Choi
Zaïd Harchaoui
151
364
0
02 Feb 2021
Scaling Laws for Transfer
Danny Hernandez
Jared Kaplan
T. Henighan
Sam McCandlish
97
251
0
02 Feb 2021
Generative Spoken Language Modeling from Raw Audio
Kushal Lakhotia
Evgeny Kharitonov
Wei-Ning Hsu
Yossi Adi
Adam Polyak
...
Tu Nguyen
Jade Copet
Alexei Baevski
A. Mohamed
Emmanuel Dupoux
AuLLM
290
366
0
01 Feb 2021
Measuring and Improving Consistency in Pretrained Language Models
Yanai Elazar
Nora Kassner
Shauli Ravfogel
Abhilasha Ravichander
Eduard H. Hovy
Hinrich Schütze
Yoav Goldberg
HILM
335
371
0
01 Feb 2021
Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained Language Models
Nora Kassner
Philipp Dufter
Hinrich Schütze
78
141
0
01 Feb 2021
Neural Network architectures to classify emotions in Indian Classical Music
U. Sarkar
Sayan Nag
Medha Basu
Archi Banerjee
S. Sanyal
R. Sengupta
D. Ghosh
16
1
0
01 Feb 2021
Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers
Lisa Anne Hendricks
John F. J. Mellor
R. Schneider
Jean-Baptiste Alayrac
Aida Nematzadeh
148
117
0
31 Jan 2021
Can We Automate Scientific Reviewing?
Weizhe Yuan
Pengfei Liu
Graham Neubig
156
90
0
30 Jan 2021
Challenges in Automated Debiasing for Toxic Language Detection
Xuhui Zhou
Maarten Sap
Swabha Swayamdipta
Noah A. Smith
Yejin Choi
78
142
0
29 Jan 2021