Language Models are Few-Shot Learners (arXiv:2005.14165)
28 May 2020
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, T. Henighan, R. Child, Aditya A. Ramesh, Daniel M. Ziegler, Jeff Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, B. Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
Papers citing "Language Models are Few-Shot Learners" (50 of 11,214 shown):
How does a Pre-Trained Transformer Integrate Contextual Keywords? Application to Humanitarian Computing (07 Nov 2021)
Valentin Barrière, Guillaume Jacquet

Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling (06 Nov 2021)
Renrui Zhang, Rongyao Fang, Wei Zhang, Peng Gao, Kunchang Li, Jifeng Dai, Yu Qiao, Hongsheng Li

Exploiting a Zoo of Checkpoints for Unseen Tasks (05 Nov 2021)
Jiaji Huang, Qiang Qiu, Kenneth Church

Improving Visual Quality of Image Synthesis by A Token-based Generator with Transformers (05 Nov 2021)
Yanhong Zeng, Huan Yang, Hongyang Chao, Jianbo Wang, Jianlong Fu

LILA: Language-Informed Latent Actions (05 Nov 2021)
Siddharth Karamcheti, Megha Srivastava, Percy Liang, Dorsa Sadigh
B-Pref: Benchmarking Preference-Based Reinforcement Learning (04 Nov 2021)
Kimin Lee, Laura M. Smith, Anca Dragan, Pieter Abbeel

CLUES: Few-Shot Learning Evaluation in Natural Language Understanding (04 Nov 2021)
Subhabrata Mukherjee, Xiaodong Liu, Guoqing Zheng, Saghar Hosseini, Hao Cheng, Greg Yang, Christopher Meek, Ahmed Hassan Awadallah, Jianfeng Gao

LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs (03 Nov 2021)
Christoph Schuhmann, Richard Vencu, Romain Beaumont, R. Kaczmarczyk, Clayton Mullis, Aarush Katta, Theo Coombes, J. Jitsev, Aran Komatsuzaki
Automatic Evaluation and Moderation of Open-domain Dialogue Systems (03 Nov 2021)
Chen Zhang, João Sedoc, L. F. D’Haro, Rafael E. Banchs, Alexander I. Rudnicky

An Explanation of In-context Learning as Implicit Bayesian Inference (03 Nov 2021)
Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma

OpenPrompt: An Open-source Framework for Prompt-learning (03 Nov 2021)
Ning Ding, Shengding Hu, Weilin Zhao, Yulin Chen, Zhiyuan Liu, Haitao Zheng, Maosong Sun

Discovering Supply Chain Links with Augmented Intelligence (02 Nov 2021)
Achintya Gopal, Chun-Han Chang
AI Ethics Statements -- Analysis and lessons learnt from NeurIPS Broader Impact Statements (02 Nov 2021)
Carolyn Ashurst, Emmie Hine, Paul Sedille, A. Carlier

LMdiff: A Visual Diff Tool to Compare Language Models (02 Nov 2021)
Hendrik Strobelt, Benjamin Hoover, Arvind Satyanarayan, Sebastian Gehrmann

MultiplexNet: Towards Fully Satisfied Logical Constraints in Neural Networks (02 Nov 2021)
Nicholas Hoernle, Rafael-Michael Karampatsis, Vaishak Belle, Y. Gal

Diverse Distributions of Self-Supervised Tasks for Meta-Learning in NLP (02 Nov 2021)
Trapit Bansal, K. Gunasekaran, Tong Wang, Tsendsuren Munkhdalai, Andrew McCallum
Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey (01 Nov 2021)
Bonan Min, Hayley L. Ross, Elior Sulem, Amir Pouran Ben Veyseh, Thien Huu Nguyen, Oscar Sainz, Eneko Agirre, Ilana Heinz, Dan Roth

Towards Language Modelling in the Speech Domain Using Sub-word Linguistic Units (31 Oct 2021)
Anurag Katakkar, A. Black

DAdaQuant: Doubly-adaptive quantization for communication-efficient Federated Learning (31 Oct 2021)
Robert Hönig, Yiren Zhao, Robert D. Mullins

All-In-One: Artificial Association Neural Networks (31 Oct 2021)
Seokjun Kim, Jaeeun Jang, Hyeoncheol Kim
MetaICL: Learning to Learn In Context (29 Oct 2021)
Sewon Min, M. Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi

Self-Supervised Learning Disentangled Group Representation as Feature (28 Oct 2021)
Tan Wang, Zhongqi Yue, Jianqiang Huang, Qianru Sun, Hanwang Zhang

Semi-Siamese Bi-encoder Neural Ranking Model Using Lightweight Fine-Tuning (28 Oct 2021)
Euna Jung, Jaekeol Choi, Wonjong Rhee

Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training (28 Oct 2021)
Yongbin Li, Hongxin Liu, Zhengda Bian, Boxiang Wang, Haichen Huang, Fan Cui, Chuan-Qing Wang, Yang You

Training Verifiers to Solve Math Word Problems (27 Oct 2021)
K. Cobbe, V. Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, ..., Jerry Tworek, Jacob Hilton, Reiichiro Nakano, Christopher Hesse, John Schulman
Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes (26 Oct 2021)
Sanghyun Hong, Michael-Andrei Panaitescu-Liess, Yigitcan Kaya, Tudor Dumitras

Distributionally Robust Recurrent Decoders with Random Network Distillation (25 Oct 2021)
Antonio Valerio Miceli Barone, Alexandra Birch, Rico Sennrich

Demystifying and Generalizing BinaryConnect (25 Oct 2021)
Abhishek Sharma, Yaoliang Yu, Eyyub Sari, Mahdi Zolnouri, V. Nia

Unsupervised Source Separation By Steering Pretrained Music Models (25 Oct 2021)
Ethan Manilow, P. O'Reilly, Prem Seetharaman, Bryan Pardo

AxoNN: An asynchronous, message-driven parallel framework for extreme-scale deep learning (25 Oct 2021)
Siddharth Singh, A. Bhatele

CoProtector: Protect Open-Source Code against Unauthorized Training Usage with Data Poisoning (25 Oct 2021)
Zhensu Sun, Xiaoning Du, Fu Song, Mingze Ni, Li Li

Practical Galaxy Morphology Tools from Deep Supervised Representation Learning (25 Oct 2021)
Mike Walmsley, Anna M. M. Scaife, Chris J. Lintott, Michelle Lochner, Yu Zhu, ..., Xibo Ma, Sandor Kruk, Zhen Lei, G. Guo, B. Simmons
Contrastively Disentangled Sequential Variational Autoencoder (22 Oct 2021)
M. Kiener, Weiran Wang, Michael Gerndt

When to Prune? A Policy towards Early Structural Pruning (22 Oct 2021)
Maying Shen, Pavlo Molchanov, Hongxu Yin, J. Álvarez

SOFT: Softmax-free Transformer with Linear Complexity (22 Oct 2021)
Jiachen Lu, Jinghan Yao, Junge Zhang, Martin Danelljan, Hang Xu, Weiguo Gao, Chunjing Xu, Thomas B. Schon, Li Zhang

Model, sample, and epoch-wise descents: exact solution of gradient flow in the random feature model (22 Oct 2021)
A. Bodin, N. Macris

Sinkformers: Transformers with Doubly Stochastic Attention (22 Oct 2021)
Michael E. Sander, Pierre Ablin, Mathieu Blondel, Gabriel Peyré

Wav2CLIP: Learning Robust Audio Representations From CLIP (21 Oct 2021)
Ho-Hsiang Wu, Prem Seetharaman, Kundan Kumar, J. P. Bello
21 Oct 2021
MixNorm: Test-Time Adaptation Through Online Normalization Estimation
Xuefeng Hu
M. Uzunbas
Sirius Chen
Rui Wang
Ashish Shah
Ram Nevatia
Ser-Nam Lim
TTA
28
47
0
21 Oct 2021
Fast Model Editing at Scale
E. Mitchell
Charles Lin
Antoine Bosselut
Chelsea Finn
Christopher D. Manning
KELM
230
343
0
21 Oct 2021
Transformer Acceleration with Dynamic Sparse Attention
Liu Liu
Zheng Qu
Zhaodong Chen
Yufei Ding
Yuan Xie
19
20
0
21 Oct 2021
HALP: Hardware-Aware Latency Pruning
Maying Shen
Hongxu Yin
Pavlo Molchanov
Lei Mao
Jianna Liu
J. Álvarez
VLM
46
13
0
20 Oct 2021
Data-Driven Offline Optimization For Architecting Hardware Accelerators
Aviral Kumar
Amir Yazdanbakhsh
Milad Hashemi
Kevin Swersky
Sergey Levine
27
36
0
20 Oct 2021
Interpreting Deep Learning Models in Natural Language Processing: A Review
Xiaofei Sun
Diyi Yang
Xiaoya Li
Tianwei Zhang
Yuxian Meng
Han Qiu
Guoyin Wang
Eduard H. Hovy
Jiwei Li
19
44
0
20 Oct 2021
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training
Ankur Bapna
Yu-An Chung
Na Wu
Anmol Gulati
Ye Jia
J. Clark
Melvin Johnson
Jason Riesa
Alexis Conneau
Yu Zhang
VLM
61
94
0
20 Oct 2021
Expressivity of Neural Networks via Chaotic Itineraries beyond Sharkovsky's Theorem
Clayton Sanford
Vaggos Chatziafratis
16
1
0
19 Oct 2021
Inductive Biases and Variable Creation in Self-Attention Mechanisms
Benjamin L. Edelman
Surbhi Goel
Sham Kakade
Cyril Zhang
27
117
0
19 Oct 2021
BERMo: What can BERT learn from ELMo?
Sangamesh Kodge
Kaushik Roy
38
3
0
18 Oct 2021
NormFormer: Improved Transformer Pretraining with Extra Normalization
Sam Shleifer
Jason Weston
Myle Ott
AI4CE
33
74
0
18 Oct 2021
Self-Supervised Representation Learning: Introduction, Advances and Challenges
Linus Ericsson
Henry Gouk
Chen Change Loy
Timothy M. Hospedales
SSL
OOD
AI4TS
34
273
0
18 Oct 2021