2005.14165
Language Models are Few-Shot Learners
28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
Papers citing "Language Models are Few-Shot Learners"
50 / 12,343 papers shown
Title
General-to-Specific Transfer Labeling for Domain Adaptable Keyphrase Generation
Rui Meng
Tong Wang
Xingdi Yuan
Yingbo Zhou
Daqing He
70
6
0
20 Aug 2022
Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise
Arpit Bansal
Eitan Borgnia
Hong-Min Chu
Jie S. Li
Hamid Kazemi
Furong Huang
Micah Goldblum
Jonas Geiping
Tom Goldstein
VLM
DiffM
82
286
0
19 Aug 2022
Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies
Gati Aher
Rosa I. Arriaga
Adam Tauman Kalai
174
405
0
18 Aug 2022
MulZDG: Multilingual Code-Switching Framework for Zero-shot Dialogue Generation
Yongkang Liu
Shi Feng
Daling Wang
Yifei Zhang
66
8
0
18 Aug 2022
A Scalable, Interpretable, Verifiable & Differentiable Logic Gate Convolutional Neural Network Architecture From Truth Tables
Adrien Benamira
Tristan Guérand
Thomas Peyrin
Trevor Yap
Bryan Hooi
69
2
0
18 Aug 2022
See Finer, See More: Implicit Modality Alignment for Text-based Person Retrieval
Xiujun Shu
Wei Wen
Haoqian Wu
Keyun Chen
Yi-Zhe Song
Ruizhi Qiao
Bohan Ren
Xiao Wang
91
99
0
18 Aug 2022
Understanding Scaling Laws for Recommendation Models
Newsha Ardalani
Carole-Jean Wu
Zeliang Chen
Bhargav Bhushanam
Adnan Aziz
93
31
0
17 Aug 2022
HELP ME THINK: A Simple Prompting Strategy for Non-experts to Create Customized Content with Models
Swaroop Mishra
E. Nouri
LRM
125
27
0
17 Aug 2022
MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation
Federico Cassano
John Gouwar
Daniel Nguyen
S. Nguyen
Luna Phipps-Costin
...
Carolyn Jane Anderson
Molly Q. Feldman
Arjun Guha
Michael Greenberg
Abhinav Jangda
ELM
123
93
0
17 Aug 2022
Towards Open-vocabulary Scene Graph Generation with Prompt-based Finetuning
Tao He
Lianli Gao
Jingkuan Song
Yuan-Fang Li
VLM
88
53
0
17 Aug 2022
Boosting Distributed Training Performance of the Unpadded BERT Model
Jinle Zeng
Min Li
Zhihua Wu
Jiaqi Liu
Yuang Liu
Dianhai Yu
Yanjun Ma
67
11
0
17 Aug 2022
Constrained Few-Shot Learning: Human-Like Low Sample Complexity Learning and Non-Episodic Text Classification
Jaron Mar
Jiamou Liu
84
2
0
17 Aug 2022
EGCR: Explanation Generation for Conversational Recommendation
Bingbing Wen
Xiaoning Bu
Chirag Shah
54
2
0
17 Aug 2022
Investigating the Impact of Model Width and Density on Generalization in Presence of Label Noise
Yihao Xue
Kyle Whitecross
Baharan Mirzasoleiman
NoLa
77
1
0
17 Aug 2022
What Artificial Neural Networks Can Tell Us About Human Language Acquisition
Alex Warstadt
Samuel R. Bowman
80
120
0
17 Aug 2022
Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models
Hendrik Strobelt
Albert Webson
Victor Sanh
Benjamin Hoover
Johanna Beyer
Hanspeter Pfister
Alexander M. Rush
VLM
78
141
0
16 Aug 2022
Conv-Adapter: Exploring Parameter Efficient Transfer Learning for ConvNets
Hao Chen
R. Tao
Han Zhang
Yidong Wang
Xiang Li
Weirong Ye
Jindong Wang
Guosheng Hu
Marios Savvides
VPVLM
129
57
0
15 Aug 2022
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
Tim Dettmers
M. Lewis
Younes Belkada
Luke Zettlemoyer
MQ
147
666
0
15 Aug 2022
Deception for Cyber Defence: Challenges and Opportunities
David Liebowitz
Surya Nepal
Kristen Moore
Cody James Christopher
S. Kanhere
David D. Nguyen
Roelien C. Timmer
Michael Longland
Keerth Rathakumar
70
10
0
15 Aug 2022
Grasping Core Rules of Time Series through Pure Models
Gedi Liu
Yifeng Jiang
Yicun Ouyang
Keyang Zhong
Yang Wang
AI4TS
88
0
0
15 Aug 2022
Explainable Artificial Intelligence for Assault Sentence Prediction in New Zealand
Harry Rodger
Andrew Lensen
Marcin Betkier
16
6
0
15 Aug 2022
Targeted Honeyword Generation with Language Models
Fang Yu
Miguel Vargas Martin
105
5
0
15 Aug 2022
Limits of an AI program for solving college math problems
E. Davis
AIMat
44
3
0
14 Aug 2022
Deep is a Luxury We Don't Have
Ahmed Taha
Yen Nhi Truong Vu
Brent Mombourquette
Thomas P. Matthews
Jason Su
Sadanand Singh
ViT
MedIm
53
2
0
11 Aug 2022
Interactive Code Generation via Test-Driven User-Intent Formalization
Shuvendu K. Lahiri
Sarah Fakhoury
Aaditya Naik
Georgios Sakkas
Saikat Chakraborty
...
Piali Choudhury
Curtis von Veh
J. Inala
Chenglong Wang
Jianfeng Gao
95
64
0
11 Aug 2022
A Comprehensive Survey of Natural Language Generation Advances from the Perspective of Digital Deception
Keenan I. Jones
Enes Altuncu
V. N. Franqueira
Yi-Chia Wang
Shujun Li
DeLMO
80
3
0
11 Aug 2022
On the Pros and Cons of Momentum Encoder in Self-Supervised Visual Representation Learning
T. Pham
Chaoning Zhang
Axi Niu
Kang Zhang
Chang D. Yoo
78
11
0
11 Aug 2022
Safety and Performance, Why not Both? Bi-Objective Optimized Model Compression toward AI Software Deployment
Jie Zhu
Leye Wang
Xiao Han
87
10
0
11 Aug 2022
Finding Reusable Machine Learning Components to Build Programming Language Processing Pipelines
Patrick Flynn
T. Vanderbruggen
C. Liao
Pei-Hung Lin
M. Emani
Xipeng Shen
80
4
0
11 Aug 2022
Reducing Retraining by Recycling Parameter-Efficient Prompts
Brian Lester
Joshua Yurtsever
Siamak Shakeri
Noah Constant
51
12
0
10 Aug 2022
Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP
Thao Nguyen
Gabriel Ilharco
Mitchell Wortsman
Sewoong Oh
Ludwig Schmidt
CLIP
VLM
180
108
0
10 Aug 2022
Can Brain Signals Reveal Inner Alignment with Human Languages?
William Jongwon Han
Jielin Qiu
Jiacheng Zhu
Mengdi Xu
Douglas Weber
Yue Liu
Ding Zhao
119
13
0
10 Aug 2022
CoditT5: Pretraining for Source Code and Natural Language Editing
Jiyang Zhang
Sheena Panthaplackel
Pengyu Nie
Junyi Jessy Li
Miloš Gligorić
KELM
93
92
0
10 Aug 2022
Generative Action Description Prompts for Skeleton-based Action Recognition
Wangmeng Xiang
Chong Li
Yuxuan Zhou
Biao Wang
Lei Zhang
96
36
0
10 Aug 2022
Limitations of Language Models in Arithmetic and Symbolic Induction
Jingu Qian
Hong Wang
Zekun Li
Shiyang Li
Xifeng Yan
ReLM
LRM
139
76
0
09 Aug 2022
Training Overparametrized Neural Networks in Sublinear Time
Yichuan Deng
Han Hu
Zhao Song
Omri Weinstein
Danyang Zhuo
BDL
92
28
0
09 Aug 2022
A Theoretical View on Sparsely Activated Networks
Cenk Baykal
Nishanth Dikkala
Rina Panigrahy
Cyrus Rashtchian
Xin Wang
28
11
0
08 Aug 2022
Txt2Img-MHN: Remote Sensing Image Generation from Text Using Modern Hopfield Networks
Yonghao Xu
Weikang Yu
Pedram Ghamisi
Michael K Kopp
Sepp Hochreiter
66
34
0
08 Aug 2022
Controlled Sparsity via Constrained Optimization or: How I Learned to Stop Tuning Penalties and Love Constraints
Jose Gallego-Posada
Juan Ramirez
Akram Erraqabi
Yoshua Bengio
Simon Lacoste-Julien
152
22
0
08 Aug 2022
Learning to Learn to Predict Performance Regressions in Production at Meta
M. Beller
Hongyu Li
V. Nair
V. Murali
Imad Ahmad
Jürgen Cito
Drew Carlson
Gareth Ari Aye
Wes Dyer
63
5
0
08 Aug 2022
Investigating Efficiently Extending Transformers for Long Input Summarization
Jason Phang
Yao-Min Zhao
Peter J. Liu
RALM
LLMAG
85
63
0
08 Aug 2022
Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning
Ting-Li Chen
Ruixiang Zhang
Geoffrey E. Hinton
DiffM
125
313
0
08 Aug 2022
Abstractive Meeting Summarization: A Survey
Virgile Rennard
Guokan Shang
Julie Hunter
Michalis Vazirgiannis
101
16
0
08 Aug 2022
Social Simulacra: Creating Populated Prototypes for Social Computing Systems
J. Park
Lindsay Popowski
Carrie J. Cai
Meredith Ringel Morris
Percy Liang
Michael S. Bernstein
85
298
0
08 Aug 2022
Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model
Di Wang
Qiming Zhang
Yufei Xu
Jing Zhang
Bo Du
Dacheng Tao
Lefei Zhang
84
257
0
08 Aug 2022
Provable Acceleration of Nesterov's Accelerated Gradient Method over Heavy Ball Method in Training Over-Parameterized Neural Networks
Xin Liu
Wei Tao
Wei Li
Dazhi Zhan
Jun Wang
Zhisong Pan
ODL
78
1
0
08 Aug 2022
On Transfer of Adversarial Robustness from Pretraining to Downstream Tasks
Laura Fee Nern
Harsh Raj
Maurice Georgi
Yash Sharma
AAML
97
4
0
07 Aug 2022
Frozen CLIP Models are Efficient Video Learners
Ziyi Lin
Shijie Geng
Renrui Zhang
Peng Gao
Gerard de Melo
Xiaogang Wang
Jifeng Dai
Yu Qiao
Hongsheng Li
CLIP
VLM
98
209
0
06 Aug 2022
Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models
Margaret Li
Suchin Gururangan
Tim Dettmers
M. Lewis
Tim Althoff
Noah A. Smith
Luke Zettlemoyer
MoMe
110
154
0
05 Aug 2022
A Holistic Approach to Undesired Content Detection in the Real World
Todor Markov
Chong Zhang
Sandhini Agarwal
Tyna Eloundou
Teddy Lee
Steven Adler
Angela Jiang
L. Weng
125
237
0
05 Aug 2022