Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2005.14165
Cited By
v1
v2
v3
v4 (latest)
Language Models are Few-Shot Learners
28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Ma-teusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Language Models are Few-Shot Learners"
50 / 12,431 papers shown
Title
Tokenization Consistency Matters for Generative Models on Extractive NLP Tasks
Kaiser Sun
Peng Qi
Yuhao Zhang
Lan Liu
William Yang Wang
Zhiheng Huang
80
9
0
19 Dec 2022
Inducing Character-level Structure in Subword-based Language Models with Type-level Interchange Intervention Training
Jing-ling Huang
Zhengxuan Wu
Kyle Mahowald
Christopher Potts
83
14
0
19 Dec 2022
Improved Long-Form Spoken Language Translation with Large Language Models
Arya D. McCarthy
Haotong Zhang
Shankar Kumar
Felix Stahlberg
Axel H. Ng
71
2
0
19 Dec 2022
A Comparative Study on Textual Saliency of Styles from Eye Tracking, Annotations, and Language Models
Karin de Langis
Dongyeop Kang
103
1
0
19 Dec 2022
Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations
Xinxi Lyu
Sewon Min
Iz Beltagy
Luke Zettlemoyer
Hannaneh Hajishirzi
VLM
75
68
0
19 Dec 2022
Synthetic Pre-Training Tasks for Neural Machine Translation
Zexue He
Graeme W. Blackwood
Yikang Shen
Julian McAuley
Rogerio Feris
54
4
0
19 Dec 2022
Training Trajectories of Language Models Across Scales
Mengzhou Xia
Mikel Artetxe
Chunting Zhou
Xi Lin
Ramakanth Pasunuru
Danqi Chen
Luke Zettlemoyer
Ves Stoyanov
AIFin
LRM
98
64
0
19 Dec 2022
Scalable Diffusion Models with Transformers
William S. Peebles
Saining Xie
GNN
175
2,440
0
19 Dec 2022
Evaluating Human-Language Model Interaction
Mina Lee
Megha Srivastava
Amelia Hardy
John Thickstun
Esin Durmus
...
Hancheng Cao
Tony Lee
Rishi Bommasani
Michael S. Bernstein
Percy Liang
LM&MA
ALM
108
102
0
19 Dec 2022
DSI++: Updating Transformer Memory with New Documents
Sanket Vaibhav Mehta
Jai Gupta
Yi Tay
Mostafa Dehghani
Vinh Q. Tran
J. Rao
Marc Najork
Emma Strubell
Donald Metzler
CLL
103
46
0
19 Dec 2022
Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments
Yu Gu
Xiang Deng
Yu-Chuan Su
LLMAG
123
58
0
19 Dec 2022
A Retrieve-and-Read Framework for Knowledge Graph Link Prediction
Vardaan Pahuja
Boshi Wang
Hugo Latapie
Jayanth Srinivasa
Yu-Chuan Su
76
13
0
19 Dec 2022
On Event Individuation for Document-Level Information Extraction
William Gantt
Reno Kriz
Yunmo Chen
Siddharth Vashishtha
Aaron Steven White
69
2
0
19 Dec 2022
Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor
Or Honovich
Thomas Scialom
Omer Levy
Timo Schick
ALM
167
374
0
19 Dec 2022
Multilingual Sequence-to-Sequence Models for Hebrew NLP
Matan Eyal
Hila Noga
Roee Aharoni
Idan Szpektor
Reut Tsarfaty
47
4
0
19 Dec 2022
StyleFlow: Disentangle Latent Representations via Normalizing Flow for Unsupervised Text Style Transfer
Kangchen Zhu
Zhiliang Tian
Ruifeng Luo
Xiaoguang Mao
OOD
105
3
0
19 Dec 2022
Visconde: Multi-document QA with GPT-3 and Neural Reranking
Jayr Pereira
R. Fidalgo
R. Lotufo
Rodrigo Nogueira
BDL
RALM
78
33
0
19 Dec 2022
Cross-Lingual Retrieval Augmented Prompt for Low-Resource Languages
Ercong Nie
Sheng Liang
Helmut Schmid
Hinrich Schütze
VLM
RALM
LRM
114
22
0
19 Dec 2022
Optimizing Prompts for Text-to-Image Generation
Y. Hao
Zewen Chi
Li Dong
Furu Wei
125
152
0
19 Dec 2022
Explanation Regeneration via Information Bottleneck
Qintong Li
Zhiyong Wu
Lingpeng Kong
Wei Bi
93
4
0
19 Dec 2022
Reasoning with Language Model Prompting: A Survey
Shuofei Qiao
Yixin Ou
Ningyu Zhang
Xiang Chen
Yunzhi Yao
Shumin Deng
Chuanqi Tan
Fei Huang
Huajun Chen
ReLM
ELM
LRM
232
327
0
19 Dec 2022
Latent Diffusion for Language Generation
Justin Lovelace
Varsha Kishore
Chao-gang Wan
Eliot Shekhtman
Kilian Q. Weinberger
DiffM
132
82
0
19 Dec 2022
Medical Knowledge Graph QA for Drug-Drug Interaction Prediction based on Multi-hop Machine Reading Comprehension
Peng Gao
Feng Gao
Jiancheng Ni
Yu Wang
Fei Wang
62
3
0
19 Dec 2022
AI Art in Architecture
J. Ploennigs
Markus Berger
DiffM
81
72
0
19 Dec 2022
Review of security techniques for memristor computing systems
Minhui Zou
Nan Du
Shahar Kvatinsky
AAML
26
7
0
19 Dec 2022
E-NER -- An Annotated Named Entity Recognition Corpus of Legal Text
Ting Wai Terence Au
Ingemar J. Cox
Vasileios Lampos
AILaw
72
28
0
19 Dec 2022
MIGA: A Unified Multi-task Generation Framework for Conversational Text-to-SQL
Yingwen Fu
Wenjie Ou
Zhou Yu
Yue Lin
75
7
0
19 Dec 2022
PromptBoosting: Black-Box Text Classification with Ten Forward Passes
Bairu Hou
J. O'Connor
Jacob Andreas
Shiyu Chang
Yang Zhang
VLM
57
44
0
19 Dec 2022
Discovering Language Model Behaviors with Model-Written Evaluations
Ethan Perez
Sam Ringer
Kamilė Lukošiūtė
Karina Nguyen
Edwin Chen
...
Danny Hernandez
Deep Ganguli
Evan Hubinger
Nicholas Schiefer
Jared Kaplan
ALM
102
407
0
19 Dec 2022
Natural Language to Code Generation in Interactive Data Science Notebooks
Pengcheng Yin
Wen-Ding Li
Kefan Xiao
Abhishek Rao
Yeming Wen
...
Paige Bailey
Michele Catasta
Henryk Michalewski
Oleksandr Polozov
Charles Sutton
85
66
0
19 Dec 2022
ColoristaNet for Photorealistic Video Style Transfer
Xiaowen Qiu
Ruize Xu
Boan He
Yingtao Zhang
Wenqiang Zhang
Weifeng Ge
59
0
0
19 Dec 2022
I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation
Chandra Bhagavatula
Jena D. Hwang
Doug Downey
Ronan Le Bras
Ximing Lu
Lianhui Qin
Keisuke Sakaguchi
Swabha Swayamdipta
Peter West
Yejin Choi
103
34
0
19 Dec 2022
Emergent Analogical Reasoning in Large Language Models
Taylor Webb
K. Holyoak
Hongjing Lu
ReLM
ELM
LRM
AI4CE
112
320
0
19 Dec 2022
Rainproof: An Umbrella To Shield Text Generators From Out-Of-Distribution Data
Maxime Darrin
Pablo Piantanida
Pierre Colombo
OODD
222
15
0
18 Dec 2022
Medical Diagnosis with Large Scale Multimodal Transformers: Leveraging Diverse Data for More Accurate Diagnosis
Firas Khader
Gustav Mueller-Franzes
Tian Wang
T. Han
Soroosh Tayebi Arasteh
...
Keno Bressem
Christiane Kuhl
S. Nebelung
Jakob Nikolas Kather
Daniel Truhn
32
6
0
18 Dec 2022
JEMMA: An Extensible Java Dataset for ML4Code Applications
Anjan Karmakar
Miltiadis Allamanis
Romain Robbes
VLM
55
3
0
18 Dec 2022
Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale
Hritik Bansal
Karthik Gopalakrishnan
Saket Dingliwal
S. Bodapati
Katrin Kirchhoff
Dan Roth
LRM
90
51
0
18 Dec 2022
On the Connection between Invariant Learning and Adversarial Training for Out-of-Distribution Generalization
Shiji Xin
Yifei Wang
Jingtong Su
Yisen Wang
OOD
92
7
0
18 Dec 2022
Synthesis and Evaluation of a Domain-specific Large Data Set for Dungeons & Dragons
Akila Peiris
Nisansa de Silva
54
5
0
18 Dec 2022
PoE: a Panel of Experts for Generalized Automatic Dialogue Assessment
Chen Zhang
L. F. D’Haro
Qiquan Zhang
Thomas Friedrichs
Haizhou Li
77
7
0
18 Dec 2022
Low-Resource Authorship Style Transfer: Can Non-Famous Authors Be Imitated?
Ajay Patel
Nicholas Andrews
Chris Callison-Burch
75
7
0
18 Dec 2022
Language model acceptability judgements are not always robust to context
Koustuv Sinha
Jon Gauthier
Aaron Mueller
Kanishka Misra
Keren Fuentes
R. Levy
Adina Williams
90
18
0
18 Dec 2022
Foundation models in brief: A historical, socio-technical focus
Johannes Schneider
VLM
71
9
0
17 Dec 2022
Controlling Styles in Neural Machine Translation with Activation Prompt
Yifan Wang
Zewei Sun
Shanbo Cheng
Weiguo Zheng
Mingxuan Wang
87
10
0
17 Dec 2022
Improving Cross-task Generalization of Unified Table-to-text Models with Compositional Task Configurations
Jifan Chen
Yuhao Zhang
Lan Liu
Rui Dong
Xinchi Chen
Patrick Ng
William Yang Wang
Zhiheng Huang
AI4CE
69
4
0
17 Dec 2022
Neural Story Planning
Anbang Ye
Christopher Cui
Taiwei Shi
Mark O. Riedl
59
8
0
16 Dec 2022
Rarely a problem? Language models exhibit inverse scaling in their predictions following few-type quantifiers
J. Michaelov
Benjamin Bergen
44
17
0
16 Dec 2022
Evaluating Step-by-Step Reasoning through Symbolic Verification
Yi-Fan Zhang
Hanlin Zhang
Li Erran Li
Eric P. Xing
ReLM
LRM
92
8
0
16 Dec 2022
Plansformer: Generating Symbolic Plans using Transformers
Vishal Pallagani
Bharath Muppasani
K. Murugesan
F. Rossi
L. Horesh
Biplav Srivastava
F. Fabiano
Andrea Loreggia
LM&Ro
LLMAG
OffRL
74
38
0
16 Dec 2022
Hippocampus-Inspired Cognitive Architecture (HICA) for Operant Conditioning
Deokgun Park
Md Ashaduzzaman Rubel Mondol
Sm Mazharul Islam
Aishwarya Pothula
45
0
0
16 Dec 2022
Previous
1
2
3
...
170
171
172
...
247
248
249
Next