Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2005.14165
Cited By
v1
v2
v3
v4 (latest)
Language Models are Few-Shot Learners
28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Ma-teusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Language Models are Few-Shot Learners"
50 / 12,427 papers shown
Title
A Neural ODE Interpretation of Transformer Layers
Yaofeng Desmond Zhong
Tongtao Zhang
Amit Chakraborty
Biswadip Dey
139
10
0
12 Dec 2022
Continuation KD: Improved Knowledge Distillation through the Lens of Continuation Optimization
A. Jafari
I. Kobyzev
Mehdi Rezagholizadeh
Pascal Poupart
A. Ghodsi
VLM
75
5
0
12 Dec 2022
Federated Few-Shot Learning for Mobile NLP
Dongqi Cai
Shangguang Wang
Yaozong Wu
F. Lin
Mengwei Xu
FedML
93
12
0
12 Dec 2022
Improving Generalization of Pre-trained Language Models via Stochastic Weight Averaging
Peng Lu
I. Kobyzev
Mehdi Rezagholizadeh
Ahmad Rashid
A. Ghodsi
Philippe Langlais
MoMe
104
11
0
12 Dec 2022
Collaborating Heterogeneous Natural Language Processing Tasks via Federated Learning
Chenhe Dong
Yuexiang Xie
Bolin Ding
Ying Shen
Yaliang Li
FedML
62
5
0
12 Dec 2022
On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline
Nicklas Hansen
Zhecheng Yuan
Yanjie Ze
Tongzhou Mu
Aravind Rajeswaran
H. Su
Huazhe Xu
Xiaolong Wang
89
66
0
12 Dec 2022
Implementing Deep Learning-Based Approaches for Article Summarization in Indian Languages
Rahul Tangsali
Aabha Pingle
Aditya Vyawahare
Isha Joshi
Raviraj Joshi
90
7
0
12 Dec 2022
A Study of Slang Representation Methods
Aravinda Kolla
Filip Ilievski
Hông-Ân Sandlin
Alain Mermoud
58
1
0
11 Dec 2022
Elixir: Train a Large Language Model on a Small GPU Cluster
Haichen Huang
Jiarui Fang
Hongxin Liu
Shenggui Li
Yang You
VLM
79
7
0
10 Dec 2022
MAPS-KB: A Million-scale Probabilistic Simile Knowledge Base
Qi He
Xintao Wang
Jiaqing Liang
Yanghua Xiao
78
3
0
10 Dec 2022
Structured information extraction from complex scientific text with fine-tuned large language models
Alex Dunn
John Dagdelen
Nicholas Walker
Sanghoon Lee
Andrew S. Rosen
Gerbrand Ceder
Kristin A. Persson
Anubhav Jain
97
93
0
10 Dec 2022
LEAD: Liberal Feature-based Distillation for Dense Retrieval
Hao Sun
Xiao Liu
Yeyun Gong
Anlei Dong
Jing Lu
Yan Zhang
Linjun Yang
Rangan Majumder
Nan Duan
119
2
0
10 Dec 2022
REVEAL: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory
Ziniu Hu
Ahmet Iscen
Chen Sun
Zirui Wang
Kai-Wei Chang
Yizhou Sun
Cordelia Schmid
David A. Ross
Alireza Fathi
RALM
VLM
103
96
0
10 Dec 2022
MAGVIT: Masked Generative Video Transformer
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
...
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffM
VGen
121
248
0
10 Dec 2022
SMILE: Scaling Mixture-of-Experts with Efficient Bi-level Routing
Chaoyang He
Shuai Zheng
Aston Zhang
George Karypis
Trishul Chilimbi
Mahdi Soltanolkotabi
Salman Avestimehr
MoE
50
1
0
10 Dec 2022
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints
Aran Komatsuzaki
J. Puigcerver
James Lee-Thorp
Carlos Riquelme Ruiz
Basil Mustafa
Joshua Ainslie
Yi Tay
Mostafa Dehghani
N. Houlsby
MoMe
MoE
106
124
0
09 Dec 2022
Audiovisual Masked Autoencoders
Mariana-Iuliana Georgescu
Eduardo Fonseca
Radu Tudor Ionescu
Mario Lucic
Cordelia Schmid
Anurag Arnab
SSL
118
45
0
09 Dec 2022
Spurious Features Everywhere -- Large-Scale Detection of Harmful Spurious Features in ImageNet
Yannic Neuhaus
Maximilian Augustin
Valentyn Boreiko
Matthias Hein
AAML
134
32
0
09 Dec 2022
Open-world Story Generation with Structured Knowledge Enhancement: A Comprehensive Survey
Yuxin Wang
Jieru Lin
Zhiwei Yu
Wei Hu
Börje F. Karlsson
134
20
0
09 Dec 2022
Training Data Influence Analysis and Estimation: A Survey
Zayd Hammoudeh
Daniel Lowd
TDI
119
101
0
09 Dec 2022
Structured Like a Language Model: Analysing AI as an Automated Subject
Liam Magee
Vanicka Arora
Luke Munn
16
21
0
08 Dec 2022
VideoDex: Learning Dexterity from Internet Videos
Kenneth Shaw
Shikhar Bahl
Deepak Pathak
101
96
0
08 Dec 2022
Learning Video Representations from Large Language Models
Yue Zhao
Ishan Misra
Philipp Krahenbuhl
Rohit Girdhar
VLM
AI4TS
118
177
0
08 Dec 2022
General-Purpose In-Context Learning by Meta-Learning Transformers
Louis Kirsch
James Harrison
Jascha Narain Sohl-Dickstein
Luke Metz
131
78
0
08 Dec 2022
Task Bias in Vision-Language Models
Sachit Menon
I. Chandratreya
Carl Vondrick
VLM
SSL
72
6
0
08 Dec 2022
OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models
Jinze Bai
Rui Men
Han Yang
Xuancheng Ren
Kai Dang
...
Wenhang Ge
Jianxin Ma
Junyang Lin
Jingren Zhou
Chang Zhou
88
16
0
08 Dec 2022
A Rubric for Human-like Agents and NeuroAI
Ida Momennejad
128
14
0
08 Dec 2022
Skellam Mixture Mechanism: a Novel Approach to Federated Learning with Differential Privacy
Ergute Bao
Yizheng Zhu
X. Xiao
Yifan Yang
Beng Chin Ooi
B. Tan
Khin Mi Mi Aung
FedML
82
19
0
08 Dec 2022
Harnessing the Power of Multi-Task Pretraining for Ground-Truth Level Natural Language Explanations
Björn Plüster
Jakob Ambsdorf
Lukas Braach
Jae Hee Lee
S. Wermter
76
6
0
08 Dec 2022
DC-MBR: Distributional Cooling for Minimum Bayesian Risk Decoding
Jianhao Yan
Jin Xu
Fandong Meng
Jie Zhou
Yue Zhang
107
4
0
08 Dec 2022
Deep Incubation: Training Large Models by Divide-and-Conquering
Zanlin Ni
Yulin Wang
Jiangwei Yu
Haojun Jiang
Yu Cao
Gao Huang
VLM
92
11
0
08 Dec 2022
Group Generalized Mean Pooling for Vision Transformer
ByungSoo Ko
Han-Gyu Kim
Byeongho Heo
Sangdoo Yun
Sanghyuk Chun
Geonmo Gu
Wonjae Kim
ViT
88
1
0
08 Dec 2022
NP4G : Network Programming for Generalization
Shoichiro Hara
Yuji Watanabe
NAI
AI4CE
20
0
0
08 Dec 2022
Successive Prompting for Decomposing Complex Questions
Dheeru Dua
Shivanshu Gupta
Sameer Singh
Matt Gardner
ReLM
LRM
111
118
0
08 Dec 2022
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM
MoMe
MU
213
523
0
08 Dec 2022
LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models
Chan Hee Song
Jiaman Wu
Clay Washington
Brian M Sadler
Wei-Lun Chao
Yu-Chuan Su
LLMAG
LM&Ro
173
425
0
08 Dec 2022
Demystifying Prompts in Language Models via Perplexity Estimation
Hila Gonen
Srini Iyer
Terra Blevins
Noah A. Smith
Luke Zettlemoyer
LRM
159
214
0
08 Dec 2022
Transfer learning for chemically accurate interatomic neural network potentials
Viktor Zaverkin
David Holzmüller
Luca Bonfirraro
Johannes Kastner
101
25
0
07 Dec 2022
Discovering Latent Knowledge in Language Models Without Supervision
Collin Burns
Haotian Ye
Dan Klein
Jacob Steinhardt
163
386
0
07 Dec 2022
Robustness of Learning from Task Instructions
Jiasheng Gu
Hongyu Zhao
Hanzi Xu
Liang Nie
Hongyuan Mei
Wenpeng Yin
OOD
101
34
0
07 Dec 2022
Pivotal Role of Language Modeling in Recommender Systems: Enriching Task-specific and Task-agnostic Representation Learning
Kyuyong Shin
Hanock Kwak
Wonjae Kim
Jisu Jeong
Seungjae Jung
KyungHyun Kim
Jung-Woo Ha
Sang-Woo Lee
84
4
0
07 Dec 2022
Memorization of Named Entities in Fine-tuned BERT Models
Andor Diera
N. Lell
Aygul Garifullina
A. Scherp
68
0
0
07 Dec 2022
Harnessing Knowledge and Reasoning for Human-Like Natural Language Generation: A Brief Review
Jiangjie Chen
Yanghua Xiao
116
5
0
07 Dec 2022
The problem with AI consciousness: A neurogenetic case against synthetic sentience
Yoshija Walter
L. Zbinden
44
1
0
07 Dec 2022
DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing
Conglong Li
Z. Yao
Xiaoxia Wu
Minjia Zhang
Connor Holmes
Cheng Li
Yuxiong He
69
25
0
07 Dec 2022
ZegCLIP: Towards Adapting CLIP for Zero-shot Semantic Segmentation
Ziqi Zhou
Bowen Zhang
Yinjie Lei
Lingqiao Liu
Yifan Liu
VLM
97
176
0
07 Dec 2022
Talking About Large Language Models
Murray Shanahan
AI4CE
132
275
0
07 Dec 2022
Text Embeddings by Weakly-Supervised Contrastive Pre-training
Liang Wang
Nan Yang
Xiaolong Huang
Binxing Jiao
Linjun Yang
Daxin Jiang
Rangan Majumder
Furu Wei
VLM
263
624
0
07 Dec 2022
"It would work for me too": How Online Communities Shape Software Developers' Trust in AI-Powered Code Generation Tools
Ruijia Cheng
Ruotong Wang
Thomas Zimmermann
Denae Ford
109
33
0
07 Dec 2022
SimVTP: Simple Video Text Pre-training with Masked Autoencoders
Yue Ma
Tianyu Yang
Yin Shan
Xiu Li
88
27
0
07 Dec 2022
Previous
1
2
3
...
172
173
174
...
247
248
249
Next