2005.14165
Cited By
v1
v2
v3
v4 (latest)
Language Models are Few-Shot Learners
28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
Papers citing
"Language Models are Few-Shot Learners"
50 / 12,301 papers shown
Title
Language Modelling with Pixels
Phillip Rust
Jonas F. Lotz
Emanuele Bugliarello
Elizabeth Salesky
Miryam de Lhoneux
Desmond Elliott
VLM
107
46
0
14 Jul 2022
Neural Data-to-Text Generation Based on Small Datasets: Comparing the Added Value of Two Semi-Supervised Learning Approaches on Top of a Large Language Model
Chris van der Lee
Thiago Castro Ferreira
Chris Emmery
Travis J. Wiltshire
Emiel Krahmer
78
2
0
14 Jul 2022
BERTIN: Efficient Pre-Training of a Spanish Language Model using Perplexity Sampling
Javier de la Rosa
E. G. Ponferrada
Paulo Villegas
Pablo González de Prado Salas
Manu Romero
María Grandury
74
95
0
14 Jul 2022
Masked Autoencoders that Listen
Po-Yao (Bernie) Huang
Hu Xu
Juncheng Billy Li
Alexei Baevski
Michael Auli
Wojciech Galuba
Florian Metze
Christoph Feichtenhofer
142
290
0
13 Jul 2022
N-Grammer: Augmenting Transformers with latent n-grams
Aurko Roy
Rohan Anil
Guangda Lai
Benjamin Lee
Jeffrey Zhao
...
Yu
Phuong Dao
Christopher Fifty
Zhiwen Chen
Yonghui Wu
77
8
0
13 Jul 2022
Re2G: Retrieve, Rerank, Generate
Michael R. Glass
Gaetano Rossiello
Md. Faisal Mahbub Chowdhury
Ankita Rajaram Naik
Pengshan Cai
A. Gliozzo
RALM
89
96
0
13 Jul 2022
Does GNN Pretraining Help Molecular Representation?
Ruoxi Sun
Hanjun Dai
Adams Wei Yu
SSL
AI4CE
GNN
65
75
0
13 Jul 2022
Wayformer: Motion Forecasting via Simple & Efficient Attention Networks
Nigamaa Nayakanti
Rami Al-Rfou
Aurick Zhou
Kratarth Goel
Khaled S. Refaat
Benjamin Sapp
AI4TS
135
259
0
12 Jul 2022
A new hope for network model generalization
Alexander Dietmüller
Siddhant Ray
Romain Jacob
Laurent Vanbever
AI4CE
95
39
0
12 Jul 2022
Vision Transformer for NeRF-Based View Synthesis from a Single Input Image
Kai-En Lin
Yen-Chen Lin
Wei-Sheng Lai
Nayeon Lee
Yichang Shih
R. Ramamoorthi
ViT
90
114
0
12 Jul 2022
Inner Monologue: Embodied Reasoning through Planning with Language Models
Wenlong Huang
F. Xia
Ted Xiao
Harris Chan
Jacky Liang
...
Tomas Jackson
Linda Luu
Sergey Levine
Karol Hausman
Brian Ichter
LLMAG
LM&Ro
LRM
184
926
0
12 Jul 2022
CP3: Unifying Point Cloud Completion by Pretrain-Prompt-Predict Paradigm
Mingye Xu
Yali Wang
Yihao Liu
Tong He
Yu Qiao
3DPC
86
17
0
12 Jul 2022
Bootstrapping a User-Centered Task-Oriented Dialogue System
Shijie Chen
Ziru Chen
Xiang Deng
A. Lewis
Lingbo Mo
...
Zhen Wang
Xiang Yue
Tianshu Zhang
Yu-Chuan Su
Huan Sun
50
2
0
11 Jul 2022
Towards Neural Numeric-To-Text Generation From Temporal Personal Health Data
Jon Harris
Mohammed J Zaki
AI4TS
60
2
0
11 Jul 2022
Exploring Length Generalization in Large Language Models
Cem Anil
Yuhuai Wu
Anders Andreassen
Aitor Lewkowycz
Vedant Misra
V. Ramasesh
Ambrose Slone
Guy Gur-Ari
Ethan Dyer
Behnam Neyshabur
ReLM
LRM
111
170
0
11 Jul 2022
A clinically motivated self-supervised approach for content-based image retrieval of CT liver images
Kristoffer Wickstrøm
Eirik Agnalt Ostmo
Keyur Radiya
Karl Øyvind Mikalsen
Michael C. Kampffmeyer
Robert Jenssen
SSL
92
16
0
11 Jul 2022
Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning
Homer Walke
Jonathan Yang
Albert Yu
Aviral Kumar
Jedrzej Orbik
Avi Singh
Sergey Levine
OffRL
OnRL
90
32
0
11 Jul 2022
TCR: A Transformer Based Deep Network for Predicting Cancer Drugs Response
Jie Gao
Jing Hu
Wan-Na Sun
Yili Shen
Xiaonan Zhang
Xiaomin Fang
Fan Wang
Guo-Guo Zhao
MedIm
41
2
0
10 Jul 2022
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
Dhruv Shah
B. Osinski
Brian Ichter
Sergey Levine
LM&Ro
260
470
0
10 Jul 2022
Radiomics-Guided Global-Local Transformer for Weakly Supervised Pathology Localization in Chest X-Rays
Yan Han
G. Holste
Ying Ding
Ahmed H. Tewfik
Yifan Peng
Zhangyang Wang
LM&MA
ViT
121
16
0
10 Jul 2022
Few-shot training LLMs for project-specific code-summarization
Toufique Ahmed
Prem Devanbu
235
241
0
09 Jul 2022
SCouT: Synthetic Counterfactuals via Spatiotemporal Transformers for Actionable Healthcare
Bhishma Dedhia
Roshini Balasubramanian
N. Jha
MedIm
46
4
0
09 Jul 2022
Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling
Tung Nguyen
Aditya Grover
BDL
UQCV
87
107
0
09 Jul 2022
TalkToModel: Explaining Machine Learning Models with Interactive Natural Language Conversations
Dylan Slack
Satyapriya Krishna
Himabindu Lakkaraju
Sameer Singh
84
83
0
08 Jul 2022
Automatic Exploration of Textual Environments with Language-Conditioned Autotelic Agents
Laetitia Teodorescu
Xingdi Yuan
Marc-Alexandre Côté
Pierre-Yves Oudeyer
LLMAG
56
0
0
08 Jul 2022
Big Learning
Yulai Cong
Miaoyun Zhao
AI4CE
91
0
0
08 Jul 2022
Boosting Zero-shot Learning via Contrastive Optimization of Attribute Representations
Yu Du
Miaojing Shi
Fangyun Wei
Guoqi Li
VLM
121
15
0
08 Jul 2022
Meta-Learning the Difference: Preparing Large Language Models for Efficient Adaptation
Zejiang Hou
Julian Salazar
George Polovets
71
15
0
07 Jul 2022
Training Transformers Together
Alexander Borzunov
Max Ryabinin
Tim Dettmers
Quentin Lhoest
Lucile Saulnier
Michael Diskin
Yacine Jernite
Thomas Wolf
ViT
63
10
0
07 Jul 2022
Can Language Models perform Abductive Commonsense Reasoning?
Seung-Kyum Kim
ReLM
LRM
45
1
0
07 Jul 2022
The "Collections as ML Data" Checklist for Machine Learning & Cultural Heritage
Benjamin Charles Germain Lee
VLM
43
7
0
06 Jul 2022
When does Bias Transfer in Transfer Learning?
Hadi Salman
Saachi Jain
Andrew Ilyas
Logan Engstrom
Eric Wong
Aleksander Madry
89
37
0
06 Jul 2022
Rethinking the Value of Gazetteer in Chinese Named Entity Recognition
Qianglong Chen
Xiangji Zeng
Jiangang Zhu
Yin Zhang
Bojia Lin
Yang Yang
Daxin Jiang
55
2
0
06 Jul 2022
Don't Pay Attention to the Noise: Learning Self-supervised Representations of Light Curves with a Denoising Time Series Transformer
M. Morvan
N. Nikolaou
K. H. Yip
Ingo P. Waldmann
AI4TS
126
11
0
06 Jul 2022
Pure Transformers are Powerful Graph Learners
Jinwoo Kim
Tien Dat Nguyen
Seonwoo Min
Sungjun Cho
Moontae Lee
Honglak Lee
Seunghoon Hong
99
201
0
06 Jul 2022
Transformers are Adaptable Task Planners
Vidhi Jain
Yixin Lin
Eric Undersander
Yonatan Bisk
Akshara Rai
113
24
0
06 Jul 2022
Robustness Analysis of Video-Language Models Against Visual and Language Perturbations
Madeline Chantry Schiappa
Shruti Vyas
Hamid Palangi
Yogesh S Rawat
Vibhav Vineet
VLM
162
20
0
05 Jul 2022
Betty: An Automatic Differentiation Library for Multilevel Optimization
Sang Keun Choe
Willie Neiswanger
P. Xie
Eric P. Xing
AI4CE
81
32
0
05 Jul 2022
Cross-Lingual QA as a Stepping Stone for Monolingual Open QA in Icelandic
Vésteinn Snæbjarnarson
H. Einarsson
57
6
0
05 Jul 2022
TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second
Noah Hollmann
Samuel G. Müller
Katharina Eggensperger
Frank Hutter
121
317
0
05 Jul 2022
Softmax-free Linear Transformers
Jiachen Lu
Junge Zhang
Xiatian Zhu
Jianfeng Feng
Tao Xiang
Li Zhang
ViT
54
8
0
05 Jul 2022
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
Hung Le
Yue Wang
Akhilesh Deepak Gotmare
Silvio Savarese
Guosheng Lin
SyDa
ALM
227
273
0
05 Jul 2022
Probing via Prompting
Jiaoda Li
Ryan Cotterell
Mrinmaya Sachan
109
13
0
04 Jul 2022
Location reference recognition from texts: A survey and comparison
Xuke Hu
Zhiyong Zhou
Hao Li
Yingjie Hu
F. Gu
J. Kersten
H. Fan
Friederike Klan
55
51
0
04 Jul 2022
Comparing Feature Importance and Rule Extraction for Interpretability on Text Data
Gianluigi Lopardo
Damien Garreau
FAtt
98
1
0
04 Jul 2022
WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
Shunyu Yao
Howard Chen
John Yang
Karthik Narasimhan
LLMAG
LM&Ro
176
522
0
04 Jul 2022
PrUE: Distilling Knowledge from Sparse Teacher Networks
Shaopu Wang
Xiaojun Chen
Mengzhen Kou
Jinqiao Shi
106
2
0
03 Jul 2022
An Empirical Survey on Long Document Summarization: Datasets, Models and Metrics
Huan Yee Koh
Jiaxin Ju
Ming Liu
Shirui Pan
147
128
0
03 Jul 2022
Rationale-Augmented Ensembles in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Denny Zhou
ReLM
LRM
119
126
0
02 Jul 2022
Can Language Models Make Fun? A Case Study in Chinese Comical Crosstalk
Benyou Wang
Xiang Wu
Xiaokang Liu
Jianquan Li
Prayag Tiwari
Qianqian Xie
61
6
0
02 Jul 2022