2005.14165
Language Models are Few-Shot Learners
28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
Papers citing
"Language Models are Few-Shot Learners"
50 / 12,277 papers shown
Title
Natural Language Sentence Generation from API Specifications
Siyu Huo
K. Mukherjee
Jayachandu Bandlamudi
Vatche Isahagian
Vinod Muthusamy
Sadhana Kumaravel
59
2
0
01 Jun 2022
Dynaformer: A Deep Learning Model for Ageing-aware Battery Discharge Prediction
Luca Biggio
Tommaso Bendinelli
Chetan S. Kulkarni
Olga Fink
61
11
0
01 Jun 2022
Romantic-Computing
Elizabeth Horishny
79
0
0
01 Jun 2022
DiVAE: Photorealistic Images Synthesis with Denoising Diffusion Decoder
Jie Shi
Chenfei Wu
Jian Liang
Xiang Liu
Nan Duan
DiffM
76
26
0
01 Jun 2022
Task-Specific Expert Pruning for Sparse Mixture-of-Experts
Tianyu Chen
Shaohan Huang
Yuan Xie
Binxing Jiao
Daxin Jiang
Haoyi Zhou
Jianxin Li
Furu Wei
MoE
88
42
0
01 Jun 2022
THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption
Tianyu Chen
Hangbo Bao
Shaohan Huang
Li Dong
Binxing Jiao
Daxin Jiang
Haoyi Zhou
Jianxin Li
Furu Wei
98
107
0
01 Jun 2022
Byzantine-Robust Online and Offline Distributed Reinforcement Learning
Yiding Chen
Xuezhou Zhang
Kai Zhang
Mengdi Wang
Xiaojin Zhu
OffRL
133
18
0
01 Jun 2022
Efficient Self-supervised Vision Pretraining with Local Masked Reconstruction
Jun Chen
Ming Hu
Boyang Albert Li
Mohamed Elhoseiny
142
37
0
01 Jun 2022
Pre-training via Denoising for Molecular Property Prediction
Sheheryar Zaidi
Michael Schaarschmidt
James Martens
Hyunjik Kim
Yee Whye Teh
Alvaro Sanchez-Gonzalez
Peter W. Battaglia
Razvan Pascanu
Jonathan Godwin
DiffM
AI4CE
129
127
0
31 May 2022
Asynchronous Hierarchical Federated Learning
Xing Wang
Yijun Wang
FedML
73
6
0
31 May 2022
Neural Retriever and Go Beyond: A Thesis Proposal
Man Luo
100
1
0
31 May 2022
On the Usefulness of Embeddings, Clusters and Strings for Text Generator Evaluation
Tiago Pimentel
Clara Meister
Ryan Cotterell
117
7
0
31 May 2022
You Can't Count on Luck: Why Decision Transformers and RvS Fail in Stochastic Environments
Keiran Paster
Sheila A. McIlraith
Jimmy Ba
OffRL
262
28
0
31 May 2022
Knowledge Graph - Deep Learning: A Case Study in Question Answering in Aviation Safety Domain
Ankush Agarwal
Raju Gite
Shreyash Laddha
P. Bhattacharyya
Satyanarayan Kar
Asif Ekbal
Prabhjit Thind
Rajesh Zele
Ravi Shankar
122
14
0
31 May 2022
VQ-AR: Vector Quantized Autoregressive Probabilistic Time Series Forecasting
Kashif Rasul
Young-Jin Park
Max Nihlén Ramström
KyungHyun Kim
BDL
AI4TS
38
4
0
31 May 2022
Leveraging Pre-Trained Language Models to Streamline Natural Language Interaction for Self-Tracking
Young-Ho Kim
Sungdong Kim
Minsuk Chang
Sang-Woo Lee
87
5
0
31 May 2022
FinBERT-MRC: financial named entity recognition using BERT under the machine reading comprehension paradigm
Yuzhe Zhang
Hong Zhang
85
28
0
31 May 2022
Few-Shot Diffusion Models
Giorgio Giannone
Didrik Nielsen
Ole Winther
DiffM
231
51
0
30 May 2022
Attention Flows for General Transformers
Niklas Metzger
Christopher Hahn
Julian Siber
Frederik Schmitt
Bernd Finkbeiner
67
0
0
30 May 2022
Zero-Shot and Few-Shot Learning for Lung Cancer Multi-Label Classification using Vision Transformer
F. Guo
Yingfang Fan
ViT
MedIm
124
7
0
30 May 2022
VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models
Wangchunshu Zhou
Yan Zeng
Shizhe Diao
Xinsong Zhang
CoGe
VLM
97
13
0
30 May 2022
A Survey in Mathematical Language Processing
Jordan Meadows
André Freitas
AIMat
63
16
0
30 May 2022
Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models
Mengzhou Xia
Mikel Artetxe
Jingfei Du
Danqi Chen
Ves Stoyanov
49
6
0
30 May 2022
Automatic Short Math Answer Grading via In-context Meta-learning
Mengxue Zhang
Sami Baral
Neil T. Heffernan
Andrew Lan
AI4Ed
AIMat
69
25
0
30 May 2022
Flowification: Everything is a Normalizing Flow
Bálint Máté
Samuel Klein
T. Golling
François Fleuret
64
3
0
30 May 2022
Superposing Many Tickets into One: A Performance Booster for Sparse Neural Network Training
Lu Yin
Vlado Menkovski
Meng Fang
Tianjin Huang
Yulong Pei
Mykola Pechenizkiy
Decebal Constantin Mocanu
Shiwei Liu
108
8
0
30 May 2022
Billions of Parameters Are Worth More Than In-domain Training Data: A case study in the Legal Case Entailment Task
G. Rosa
L. Bonifacio
Vitor Jeronymo
Hugo Queiroz Abonizio
R. Lotufo
Rodrigo Nogueira
AILaw
ELM
100
11
0
30 May 2022
A Transistor Operations Model for Deep Learning Energy Consumption Scaling Law
Chen Li
Antonios Tsourdos
Weisi Guo
AI4CE
59
3
0
30 May 2022
Dataset Condensation via Efficient Synthetic-Data Parameterization
Jang-Hyun Kim
Jinuk Kim
Seong Joon Oh
Sangdoo Yun
Hwanjun Song
Joonhyun Jeong
Jung-Woo Ha
Hyun Oh Song
DD
518
168
0
30 May 2022
Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
Muning Wen
J. Kuba
Runji Lin
Weinan Zhang
Ying Wen
Jun Wang
Yaodong Yang
103
192
0
30 May 2022
E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation
Qihuang Zhong
Liang Ding
Juhua Liu
Bo Du
Dacheng Tao
105
27
0
30 May 2022
Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning
Aniket Didolkar
Kshitij Gupta
Anirudh Goyal
Nitesh B. Gundavarapu
Alex Lamb
Nan Rosemary Ke
Yoshua Bengio
AI4CE
200
18
0
30 May 2022
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
322
632
0
29 May 2022
Decoupling Knowledge from Memorization: Retrieval-augmented Prompt Learning
Xiang Chen
Lei Li
Ningyu Zhang
Xiaozhuan Liang
Shumin Deng
Chuanqi Tan
Fei Huang
Luo Si
Huajun Chen
VLM
153
55
0
29 May 2022
The impact of memory on learning sequence-to-sequence tasks
Alireza Seif
S. Loos
Gennaro Tucci
É. Roldán
Sebastian Goldt
70
5
0
29 May 2022
COFS: Controllable Furniture layout Synthesis
W. Para
Paul Guerrero
Niloy Mitra
Peter Wonka
3DV
88
17
0
29 May 2022
Improving VAE-based Representation Learning
Mingtian Zhang
Tim Z. Xiao
Brooks Paige
David Barber
SSL
DRL
70
10
0
28 May 2022
All That's Happening behind the Scenes: Putting the Spotlight on Volunteer Moderator Labor in Reddit
Hanlin Li
Brent J. Hecht
Stevie Chancellor
57
42
0
28 May 2022
Parameter-Efficient and Student-Friendly Knowledge Distillation
Jun Rao
Xv Meng
Liang Ding
Shuhan Qi
Dacheng Tao
93
51
0
28 May 2022
Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training
Renrui Zhang
Ziyu Guo
Rongyao Fang
Bingyan Zhao
Dong Wang
Yu Qiao
Hongsheng Li
Peng Gao
3DPC
258
261
0
28 May 2022
Gating Dropout: Communication-efficient Regularization for Sparsely Activated Transformers
R. Liu
Young Jin Kim
Alexandre Muzio
Hany Awadalla
MoE
79
22
0
28 May 2022
Teaching Models to Express Their Uncertainty in Words
Stephanie C. Lin
Jacob Hilton
Owain Evans
OOD
107
425
0
28 May 2022
Few-shot Subgoal Planning with Language Models
Lajanugen Logeswaran
Yao Fu
Moontae Lee
Honglak Lee
LRM
76
26
0
28 May 2022
Controllable Text Generation with Neurally-Decomposed Oracle
Tao Meng
Sidi Lu
Nanyun Peng
Kai-Wei Chang
BDL
103
37
0
27 May 2022
Diffusion-LM Improves Controllable Text Generation
Xiang Lisa Li
John Thickstun
Ishaan Gulrajani
Percy Liang
Tatsunori B. Hashimoto
AI4CE
248
837
0
27 May 2022
Multimodal Masked Autoencoders Learn Transferable Representations
Xinyang Geng
Hao Liu
Lisa Lee
Dale Schuurmans
Sergey Levine
Pieter Abbeel
88
119
0
27 May 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
276
2,297
0
27 May 2022
GIT: A Generative Image-to-text Transformer for Vision and Language
Jianfeng Wang
Zhengyuan Yang
Xiaowei Hu
Linjie Li
Kevin Qinghong Lin
Zhe Gan
Zicheng Liu
Ce Liu
Lijuan Wang
VLM
172
562
0
27 May 2022
AANG: Automating Auxiliary Learning
Lucio Dery
Paul Michel
M. Khodak
Graham Neubig
Ameet Talwalkar
107
9
0
27 May 2022
Simple Unsupervised Object-Centric Learning for Complex and Naturalistic Videos
Gautam Singh
Yi-Fu Wu
Sungjin Ahn
OCL
147
121
0
27 May 2022