2005.14165
Language Models are Few-Shot Learners
28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
Papers citing "Language Models are Few-Shot Learners"
50 / 12,362 papers shown
Title
Federated Learning for Chronic Obstructive Pulmonary Disease Classification with Partial Personalized Attention Mechanism
Yiqing Shen
Baiyun Liu
Ruize Yu
Yudong Wang
Shaokang Wang
Jiangfen Wu
Weidao Chen
56
5
0
28 Oct 2022
BRATsynthetic: Text De-identification using a Markov Chain Replacement Strategy for Surrogate Personal Identifying Information
J. D. Osborne
Tobias O'Leary
A. Nadimpalli
S. Aly
Richard Kennedy
23
1
0
28 Oct 2022
Kuaipedia: a Large-scale Multi-modal Short-video Encyclopedia
Haojie Pan
Zepeng Zhai
Yuzhou Zhang
Ruiji Fu
Ming Liu
Yangqiu Song
Zhongyuan Wang
Bing Qin
96
6
0
28 Oct 2022
UPainting: Unified Text-to-Image Diffusion Generation with Cross-modal Guidance
Wei Li
Xue Xu
Xinyan Xiao
Jiacheng Liu
Hu Yang
...
Zhanpeng Wang
Zhifan Feng
Qiaoqiao She
Yajuan Lyu
Hua Wu
232
30
0
28 Oct 2022
Differentially Private CutMix for Split Learning with Vision Transformer
Seungeun Oh
Jihong Park
Sihun Baek
Hyelin Nam
Praneeth Vepakomma
Ramesh Raskar
M. Bennis
Seong-Lyun Kim
FedML
80
18
0
28 Oct 2022
Stanceosaurus: Classifying Stance Towards Multilingual Misinformation
Jonathan Zheng
Ashutosh Baheti
Tarek Naous
Wei Xu
Alan Ritter
101
13
0
28 Oct 2022
On the Use of Modality-Specific Large-Scale Pre-Trained Encoders for Multimodal Sentiment Analysis
Atsushi Ando
Ryo Masumura
Akihiko Takashima
Satoshi Suzuki
Naoki Makishima
Keita Suzuki
Takafumi Moriya
Takanori Ashihara
Hiroshi Sato
96
9
0
28 Oct 2022
When Life Gives You Lemons, Make Cherryade: Converting Feedback from Bad Responses into Good Labels
Weiyan Shi
Emily Dinan
Kurt Shuster
Jason Weston
Jing Xu
116
20
0
28 Oct 2022
Leveraging Label Correlations in a Multi-label Setting: A Case Study in Emotion
Georgios Chochlakis
Gireesh Mahajan
Sabyasachee Baruah
Keith Burghardt
Kristina Lerman
Shrikanth Narayanan
93
24
0
28 Oct 2022
Gathering Strength, Gathering Storms: The One Hundred Year Study on Artificial Intelligence (AI100) 2021 Study Panel Report
Michael L. Littman
Ifeoma Ajunwa
G. Berger
Craig Boutilier
Morgan E. Currie
...
Melanie Mitchell
J. Shah
S. Sloman
Shannon Vallor
T. Walsh
ELM
96
101
0
27 Oct 2022
QUILL: Query Intent with Large Language Models using Retrieval Augmentation and Multi-stage Distillation
Krishna Srinivasan
K. Raman
Anupam Samanta
Ling-Yen Liao
L. Bertelli
Michael Bendersky
RALM
LRM
81
20
0
27 Oct 2022
What Language Model to Train if You Have One Million GPU Hours?
Teven Le Scao
Thomas Wang
Daniel Hesslow
Lucile Saulnier
Stas Bekman
...
Lintang Sutawika
Jaesung Tae
Zheng-Xin Yong
Julien Launay
Iz Beltagy
MoE
AI4CE
318
109
0
27 Oct 2022
Towards Language-driven Scientific AI
José Manuél Gómez-Pérez
49
0
0
27 Oct 2022
Can language models handle recursively nested grammatical structures? A case study on comparing models and humans
Andrew Kyle Lampinen
ReLM
ELM
121
36
0
27 Oct 2022
How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions?
Hritik Bansal
Da Yin
Masoud Monajatipoor
Kai-Wei Chang
116
103
0
27 Oct 2022
Fast DistilBERT on CPUs
Haihao Shen
Ofir Zafrir
Bo Dong
Hengyu Meng
Xinyu Ye
Zhe Wang
Yi Ding
Hanwen Chang
Guy Boudoukh
Moshe Wasserblat
VLM
60
2
0
27 Oct 2022
BERT-Flow-VAE: A Weakly-supervised Model for Multi-Label Text Classification
Ziwen Liu
J. Grau-Bové
Scott Orr
76
1
0
27 Oct 2022
COCO-DR: Combating Distribution Shifts in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning
Yue Yu
Chenyan Xiong
Si Sun
Chao Zhang
Arnold Overwijk
VLM
OOD
145
21
0
27 Oct 2022
Truncation Sampling as Language Model Desmoothing
John Hewitt
Christopher D. Manning
Percy Liang
BDL
91
84
0
27 Oct 2022
Conversing with Copilot: Exploring Prompt Engineering for Solving CS1 Problems Using Natural Language
Paul Denny
Viraj Kumar
Nasser Giacaman
78
249
0
27 Oct 2022
Open-vocabulary Semantic Segmentation with Frozen Vision-Language Models
Chaofan Ma
Yu-Hao Yang
Yanfeng Wang
Ya Zhang
Weidi Xie
VLM
74
48
0
27 Oct 2022
TRScore: A Novel GPT-based Readability Scorer for ASR Segmentation and Punctuation model evaluation and selection
Piyush Behre
S.S. Tan
A. Shah
Harini Kesavamoorthy
Shuangyu Chang
Fei Zuo
C. Basoglu
Sayan D. Pathak
80
0
0
27 Oct 2022
Contrastive Decoding: Open-ended Text Generation as Optimization
Xiang Lisa Li
Ari Holtzman
Daniel Fried
Percy Liang
Jason Eisner
Tatsunori Hashimoto
Luke Zettlemoyer
M. Lewis
156
374
0
27 Oct 2022
Active Countermeasures for Email Fraud
Wentao Chen
Fuzhou Wang
Matthew Edwards
73
5
0
26 Oct 2022
Privately Fine-Tuning Large Language Models with Differential Privacy
R. Behnia
Mohammadreza Ebrahimi
Jason L. Pacheco
B. Padmanabhan
125
51
0
26 Oct 2022
Broken Neural Scaling Laws
Ethan Caballero
Kshitij Gupta
Irina Rish
David M. Krueger
145
76
0
26 Oct 2022
Multi-lingual Evaluation of Code Generation Models
Ben Athiwaratkun
Sanjay Krishna Gouda
Zijian Wang
Xiaopeng Li
Yuchen Tian
...
Baishakhi Ray
Parminder Bhatia
Sudipta Sengupta
Dan Roth
Bing Xiang
ELM
191
177
0
26 Oct 2022
A Case for Business Process-Specific Foundation Models
Sadhana Kumaravel
Praveen Venkateswaran
Vatche Isahagian
Vinod Muthusamy
AI4CE
68
9
0
26 Oct 2022
Pretrained audio neural networks for Speech emotion recognition in Portuguese
M. Gauy
Marcelo Finger
39
4
0
26 Oct 2022
Leveraging Demonstrations with Latent Space Priors
Jonas Gehring
Deepak Gopinath
Jungdam Won
Andreas Krause
Gabriel Synnaeve
Nicolas Usunier
72
6
0
26 Oct 2022
MOCHA: A Multi-Task Training Approach for Coherent Text Generation from Cognitive Perspective
Zhe Hu
Hou Pong Chan
Lifu Huang
95
8
0
26 Oct 2022
Analyzing Multi-Task Learning for Abstractive Text Summarization
Frederic Kirstein
Jan Philip Wahle
Terry Ruas
Bela Gipp
75
4
0
26 Oct 2022
Inducer-tuning: Connecting Prefix-tuning and Adapter-tuning
Yifan Chen
Devamanyu Hazarika
Mahdi Namazifar
Yang Liu
Di Jin
Dilek Z. Hakkani-Tür
63
4
0
26 Oct 2022
Will we run out of data? Limits of LLM scaling based on human-generated data
Pablo Villalobos
A. Ho
J. Sevilla
T. Besiroglu
Lennart Heim
Marius Hobbhahn
ALM
102
125
0
26 Oct 2022
Piloting Copilot, Codex, and StarCoder2: Hot Temperature, Cold Prompts, or Black Magic?
Jean-Baptiste Döderlein
Nguessan Hermann Kouadio
M. Acher
D. Khelladi
B. Combemale
92
36
0
26 Oct 2022
RoMQA: A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering
Victor Zhong
Weijia Shi
Wen-tau Yih
Luke Zettlemoyer
106
21
0
25 Oct 2022
Synthetic Text Generation with Differential Privacy: A Simple and Practical Recipe
Xiang Yue
Huseyin A. Inan
Xuechen Li
Girish Kumar
Julia McAnallen
Hoda Shajari
Huan Sun
David Levitan
Robert Sim
152
86
0
25 Oct 2022
A single-cell gene expression language model
Will Connell
Umair W Khan
Michael J. Keiser
43
9
0
25 Oct 2022
Causal Analysis of Syntactic Agreement Neurons in Multilingual Language Models
Aaron Mueller
Yudi Xia
Tal Linzen
MILM
113
10
0
25 Oct 2022
Reading Between the Lines: Modeling User Behavior and Costs in AI-Assisted Programming
Hussein Mozannar
Gagan Bansal
Adam Fourney
Eric Horvitz
173
118
0
25 Oct 2022
OpenStance: Real-world Zero-shot Stance Detection
Hanzi Xu
Slobodan Vučetić
Wenpeng Yin
65
22
0
25 Oct 2022
Classification and Self-Supervised Regression of Arrhythmic ECG Signals Using Convolutional Neural Networks
B. Grabowski
P. Głomb
Wojciech Masarczyk
Pawel Plawiak
Ozal Yildirim
U. Acharya
Ruyan Tan
55
3
0
25 Oct 2022
Exploring Document-Level Literary Machine Translation with Parallel Paragraphs from World Literature
Katherine Thai
Marzena Karpinska
Kalpesh Krishna
Bill Ray
M. Inghilleri
John Wieting
Mohit Iyyer
71
48
0
25 Oct 2022
In-context Reinforcement Learning with Algorithm Distillation
Michael Laskin
Luyu Wang
Junhyuk Oh
Emilio Parisotto
Stephen Spencer
...
Ethan A. Brooks
Maxime Gazeau
Himanshu Sahni
Satinder Singh
Volodymyr Mnih
OffRL
82
133
0
25 Oct 2022
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Hong Liu
Sang Michael Xie
Zhiyuan Li
Tengyu Ma
AI4CE
135
55
0
25 Oct 2022
Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding
Maximillian Chen
Alexandros Papangelis
Chenyang Tao
Andrew Rosenbaum
Seokhwan Kim
Yang Liu
Zhou Yu
Dilek Z. Hakkani-Tür
108
35
0
25 Oct 2022
Contrastive Search Is What You Need For Neural Text Generation
Yixuan Su
Nigel Collier
91
53
0
25 Oct 2022
IELM: An Open Information Extraction Benchmark for Pre-Trained Language Models
Chenguang Wang
Xiao Liu
Dawn Song
VLM
41
2
0
25 Oct 2022
Exploring Mode Connectivity for Pre-trained Language Models
Yujia Qin
Cheng Qian
Jing Yi
Weize Chen
Yankai Lin
Xu Han
Zhiyuan Liu
Maosong Sun
Jie Zhou
95
21
0
25 Oct 2022
Gradient-based Weight Density Balancing for Robust Dynamic Sparse Training
Mathias Parger
Alexander Ertl
Paul Eibensteiner
J. H. Mueller
Martin Winter
M. Steinberger
54
0
0
25 Oct 2022