Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 12,288 papers shown
Title
TFPose: Direct Human Pose Estimation with Transformers
Wei Mao
Yongtao Ge
Chunhua Shen
Zhi Tian
Xinlong Wang
Zhibin Wang
ViT
98
89
0
29 Mar 2021
Whitening Sentence Representations for Better Semantics and Faster Retrieval
Jianlin Su
Jiarun Cao
Weijie Liu
Yangyiwen Ou
70
305
0
29 Mar 2021
"Weak AI" is Likely to Never Become "Strong AI", So What is its Greatest Value for us?
B. Liu
39
9
0
29 Mar 2021
Self-supervised Graph Neural Networks without explicit negative sampling
Zekarias T. Kefato
Sarunas Girdzijauskas
SSL
102
44
0
27 Mar 2021
Machine Learning Meets Natural Language Processing -- The story so far
N. Galanis
P. Vafiadis
K.-G. Mirzaev
G. Papakostas
82
7
0
27 Mar 2021
Alignment of Language Agents
Zachary Kenton
Tom Everitt
Laura Weidinger
Iason Gabriel
Vladimir Mikulik
G. Irving
85
166
0
26 Mar 2021
A Practical Survey on Faster and Lighter Transformers
Quentin Fournier
G. Caron
Daniel Aloise
137
103
0
26 Mar 2021
Understanding Robustness of Transformers for Image Classification
Srinadh Bhojanapalli
Ayan Chakrabarti
Daniel Glasner
Daliang Li
Thomas Unterthiner
Andreas Veit
ViT
113
391
0
26 Mar 2021
Data Augmentation in Natural Language Processing: A Novel Text Generation Approach for Long and Short Text Classifiers
Markus Bayer
M. Kaufhold
Björn Buchhold
Marcel Keller
J. Dallmeyer
Christian A. Reuter
94
121
0
26 Mar 2021
Vision Transformers for Dense Prediction
René Ranftl
Alexey Bochkovskiy
V. Koltun
ViT, MDE
143
1,751
0
24 Mar 2021
FastMoE: A Fast Mixture-of-Expert Training System
Jiaao He
J. Qiu
Aohan Zeng
Zhilin Yang
Jidong Zhai
Jie Tang
ALM, MoE
109
104
0
24 Mar 2021
Representing Numbers in NLP: a Survey and a Vision
Avijit Thawani
Jay Pujara
Pedro A. Szekely
Filip Ilievski
91
119
0
24 Mar 2021
Finetuning Pretrained Transformers into RNNs
Jungo Kasai
Hao Peng
Yizhe Zhang
Dani Yogatama
Gabriel Ilharco
Nikolaos Pappas
Yi Mao
Weizhu Chen
Noah A. Smith
107
67
0
24 Mar 2021
Thinking Aloud: Dynamic Context Generation Improves Zero-Shot Reasoning Performance of GPT-2
Gregor Betz
Kyle Richardson
Christian Voigt
ReLM, LRM
87
31
0
24 Mar 2021
Can Vision Transformers Learn without Natural Images?
Kodai Nakashima
Hirokatsu Kataoka
Asato Matsumoto
K. Iwata
Nakamasa Inoue
ViT
57
34
0
24 Mar 2021
Multi-view 3D Reconstruction with Transformer
Dan Wang
Xinrui Cui
Xun Chen
Zhengxia Zou
Tianyang Shi
Septimiu Salcudean
Z. J. Wang
Rabab Ward
ViT
79
90
0
24 Mar 2021
NaturalProofs: Mathematical Theorem Proving in Natural Language
Sean Welleck
Jiacheng Liu
Ronan Le Bras
Hannaneh Hajishirzi
Yejin Choi
Kyunghyun Cho
AIMat
91
69
0
24 Mar 2021
The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures
Sushant Singh
A. Mahmood
AI4TS
106
95
0
23 Mar 2021
How to decay your learning rate
Aitor Lewkowycz
114
24
0
23 Mar 2021
Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection
Jan Philip Wahle
Terry Ruas
Norman Meuschke
Bela Gipp
115
34
0
23 Mar 2021
Detecting Hate Speech with GPT-3
Ke-Li Chiu
Annie Collins
Rohan Alexander
AILaw
102
114
0
23 Mar 2021
Tiny Transformers for Environmental Sound Classification at the Edge
David Elliott
Carlos E. Otero
Steven Wyatt
Evan Martino
81
16
0
22 Mar 2021
End-to-End Trainable Multi-Instance Pose Estimation with Transformers
Lucas Stoffl
Maxime Vidal
Alexander Mathis
ViT
72
52
0
22 Mar 2021
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets
Julia Kreutzer
Isaac Caswell
Lisa Wang
Ahsan Wahab
D. Esch
...
Duygu Ataman
Orevaoghene Ahia
Oghenefego Ahia
Sweta Agrawal
Mofetoluwa Adeyemi
62
279
0
22 Mar 2021
Improving and Simplifying Pattern Exploiting Training
Derek Tam
Rakesh R Menon
Joey Tianyi Zhou
Shashank Srivastava
Colin Raffel
78
151
0
22 Mar 2021
DeepViT: Towards Deeper Vision Transformer
Daquan Zhou
Bingyi Kang
Xiaojie Jin
Linjie Yang
Xiaochen Lian
Zihang Jiang
Qibin Hou
Jiashi Feng
ViT
119
525
0
22 Mar 2021
Lawyers are Dishonest? Quantifying Representational Harms in Commonsense Knowledge Resources
Ninareh Mehrabi
Pei Zhou
Fred Morstatter
Jay Pujara
Xiang Ren
Aram Galstyan
AILaw
124
45
0
21 Mar 2021
Attribute Alignment: Controlling Text Generation from Pre-trained Language Models
Dian Yu
Zhou Yu
Kenji Sagae
82
39
0
20 Mar 2021
Paint by Word
A. Andonian
David Bau
Audrey Cui
YeonHwan Park
Ali Jahanian
Antonio Torralba
A. Oliva
DiffM
96
125
0
19 Mar 2021
GPT Understands, Too
Xiao Liu
Yanan Zheng
Zhengxiao Du
Ming Ding
Yujie Qian
Zhilin Yang
Jie Tang
VLM
178
1,184
0
18 Mar 2021
GLM: General Language Model Pretraining with Autoregressive Blank Infilling
Zhengxiao Du
Yujie Qian
Xiao Liu
Ming Ding
J. Qiu
Zhilin Yang
Jie Tang
BDL, AI4CE
156
1,565
0
18 Mar 2021
Structure Inducing Pre-Training
Matthew B. A. McDermott
Brendan Yap
Peter Szolovits
Marinka Zitnik
85
21
0
18 Mar 2021
Set-to-Sequence Methods in Machine Learning: a Review
Mateusz Jurewicz
Leon Derczynski
BDL
63
10
0
17 Mar 2021
Towards Few-Shot Fact-Checking via Perplexity
Nayeon Lee
Yejin Bang
Andrea Madotto
Madian Khabsa
Pascale Fung
AAML
50
93
0
17 Mar 2021
OGB-LSC: A Large-Scale Challenge for Machine Learning on Graphs
Weihua Hu
Matthias Fey
Hongyu Ren
Maho Nakata
Yuxiao Dong
J. Leskovec
AI4CE
106
415
0
17 Mar 2021
Learning without gradient descent encoded by the dynamics of a neurobiological model
V. George
V. Morar
Weiwei Yang
Jonathan Larson
B. Tower
Shweti Mahajan
Arkin Gupta
Christopher M. White
Gabriel A. Silva
26
1
0
16 Mar 2021
Get Your Vitamin C! Robust Fact Verification with Contrastive Evidence
Tal Schuster
Adam Fisch
Regina Barzilay
114
239
0
15 Mar 2021
How Many Data Points is a Prompt Worth?
Teven Le Scao
Alexander M. Rush
VLM
196
303
0
15 Mar 2021
A Whole Brain Probabilistic Generative Model: Toward Realizing Cognitive Architectures for Developmental Robots
T. Taniguchi
Hiroshi Yamakawa
Takayuki Nagai
Kenji Doya
M. Sakagami
Masahiro Suzuki
Tomoaki Nakamura
Akira Taniguchi
79
24
0
15 Mar 2021
Revisiting ResNets: Improved Training and Scaling Strategies
Irwan Bello
W. Fedus
Xianzhi Du
E. D. Cubuk
A. Srinivas
Nayeon Lee
Jonathon Shlens
Barret Zoph
98
302
0
13 Mar 2021
Inductive Relation Prediction by BERT
H. Zha
Zhiyu Zoey Chen
Xifeng Yan
146
58
0
12 Mar 2021
Towards Socially Intelligent Agents with Mental State Transition and Human Utility
Liang Qiu
Yizhou Zhao
Yuan Liang
Pan Lu
Weiyan Shi
Zhou Yu
Song-Chun Zhu
LLMAG
91
15
0
12 Mar 2021
On Improving Deep Learning Trace Analysis with System Call Arguments
Quentin Fournier
Daniel Aloise
S. V. Azhari
François Tetreault
53
10
0
11 Mar 2021
CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation
J. Clark
Dan Garrette
Iulia Turc
John Wieting
117
224
0
11 Mar 2021
Integration of Convolutional Neural Networks in Mobile Applications
Roger Creus Castanyer
Silverio Martínez-Fernández
Xavier Franch
59
12
0
11 Mar 2021
BYOL for Audio: Self-Supervised Learning for General-Purpose Audio Representation
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
Noboru Harada
K. Kashino
SSL
100
179
0
11 Mar 2021
Topical Language Generation using Transformers
Rohola Zandie
Mohammad H. Mahoor
BDL
29
5
0
11 Mar 2021
Hurdles to Progress in Long-form Question Answering
Kalpesh Krishna
Aurko Roy
Mohit Iyyer
72
200
0
10 Mar 2021
CUAD: An Expert-Annotated NLP Dataset for Legal Contract Review
Dan Hendrycks
Collin Burns
Anya Chen
Spencer Ball
ELM, AILaw
69
195
0
10 Mar 2021
Variable-rate discrete representation learning
Sander Dieleman
C. Nash
Jesse Engel
Karen Simonyan
BDL, DRL
82
24
0
10 Mar 2021