ResearchTrend.AI
Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 12,273 papers shown
Title
Bottleneck Transformers for Visual Recognition
A. Srinivas
Nayeon Lee
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
367
997
0
27 Jan 2021
Adaptivity without Compromise: A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic Optimization
Aaron Defazio
Samy Jelassi
ODL
62
69
0
26 Jan 2021
An Efficient Statistical-based Gradient Compression Technique for Distributed Training Systems
A. Abdelmoniem
Ahmed Elzanaty
Mohamed-Slim Alouini
Marco Canini
123
77
0
26 Jan 2021
TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models
Chunxing Yin
Bilge Acun
Xing Liu
Carole-Jean Wu
99
106
0
25 Jan 2021
Curriculum Learning: A Survey
Petru Soviany
Radu Tudor Ionescu
Paolo Rota
N. Sebe
ODL
186
363
0
25 Jan 2021
Pruning and Quantization for Deep Neural Network Acceleration: A Survey
Tailin Liang
C. Glossner
Lei Wang
Shaobo Shi
Xiaotong Zhang
MQ
233
708
0
24 Jan 2021
Training Multilingual Pre-trained Language Model with Byte-level Subwords
Junqiu Wei
Qun Liu
Yinpeng Guo
Xin Jiang
58
20
0
23 Jan 2021
The Impact of Multiple Parallel Phrase Suggestions on Email Input and Composition Behaviour of Native and Non-Native English Writers
Daniel Buschek
Martin Zurn
Malin Eiband
170
106
0
22 Jan 2021
Will Artificial Intelligence supersede Earth System and Climate Models?
C. Irrgang
Niklas Boers
Maike Sonnewald
E. Barnes
C. Kadow
J. Staneva
J. Saynisch‐Wagner
AI4Cl, AI4CE
80
195
0
22 Jan 2021
Zero-shot Generalization in Dialog State Tracking through Generative Question Answering
Shuyang Li
Jin Cao
Mukund Sridhar
Henghui Zhu
Shang-Wen Li
Wael Hamza
Julian McAuley
BDL
72
46
0
20 Jan 2021
Word Alignment by Fine-tuning Embeddings on Parallel Corpora
Zi-Yi Dou
Graham Neubig
181
271
0
20 Jan 2021
Towards Facilitating Empathic Conversations in Online Mental Health Support: A Reinforcement Learning Approach
Ashish Sharma
Inna Wanyin Lin
Adam S. Miner
David C. Atkins
Tim Althoff
AI4MH
118
149
0
19 Jan 2021
The Next Decade of Telecommunications Artificial Intelligence
Ouyang Ye
Lilei Wang
Aidong Yang
Le Su
David Belanger
Tongqing Gao
Leping Wei
Yaqin Zhang
59
17
0
19 Jan 2021
Diagnostic Captioning: A Survey
John Pavlopoulos
Vasiliki Kougia
Ion Androutsopoulos
D. Papamichail
3DV, MedIm
153
29
0
18 Jan 2021
Machine learning with limited data
Fupin Yao
VLM
57
8
0
18 Jan 2021
Model Compression for Domain Adaptation through Causal Effect Estimation
Guy Rotman
Amir Feder
Roi Reichart
CML
90
7
0
18 Jan 2021
ZeRO-Offload: Democratizing Billion-Scale Model Training
Jie Ren
Samyam Rajbhandari
Reza Yazdani Aminabadi
Olatunji Ruwase
Shuangyang Yang
Minjia Zhang
Dong Li
Yuxiong He
MoE
281
434
0
18 Jan 2021
What Makes Good In-Context Examples for GPT-3?
Jiachang Liu
Dinghan Shen
Yizhe Zhang
Bill Dolan
Lawrence Carin
Weizhu Chen
AAML, RALM
393
1,400
0
17 Jan 2021
Understanding in Artificial Intelligence
S. Maetschke
D. M. Iraola
Pieter Barnard
Elaheh Shafieibavani
Peter Zhong
Ying Xu
Antonio Jimeno Yepes
ELM, VLM
44
0
0
17 Jan 2021
LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning
Yuhuai Wu
M. Rabe
Wenda Li
Jimmy Ba
Roger C. Grosse
Christian Szegedy
AIMat, LRM
139
57
0
15 Jan 2021
Counterfactual Generative Networks
Axel Sauer
Andreas Geiger
OOD, BDL, CML
102
127
0
15 Jan 2021
Supervised Transfer Learning at Scale for Medical Imaging
Basil Mustafa
Aaron Loh
Jan Freyberg
Patricia MacWilliams
Megan Wilson
...
Shruthi Prabhakara
Umesh Telang
Alan Karthikesalingam
N. Houlsby
Vivek Natarajan
LM&MA
159
68
0
14 Jan 2021
Persistent Anti-Muslim Bias in Large Language Models
Abubakar Abid
Maheen Farooqi
James Zou
AILaw
114
560
0
14 Jan 2021
Structured Prediction as Translation between Augmented Natural Languages
Giovanni Paolini
Ben Athiwaratkun
Jason Krone
Jie Ma
Alessandro Achille
Rishita Anubhai
Cicero Nogueira dos Santos
Bing Xiang
Stefano Soatto
90
295
0
14 Jan 2021
GAN Inversion: A Survey
Weihao Xia
Yulun Zhang
Yujiu Yang
Jing-Hao Xue
Bolei Zhou
Ming-Hsuan Yang
DiffM
237
520
0
14 Jan 2021
Training Data Leakage Analysis in Language Models
Huseyin A. Inan
Osman Ramadan
Lukas Wutschitz
Daniel Jones
Victor Rühle
James Withers
Robert Sim
MIACV, PILM
98
9
0
14 Jan 2021
Robustness Gym: Unifying the NLP Evaluation Landscape
Karan Goel
Nazneen Rajani
Jesse Vig
Samson Tan
Jason M. Wu
Stephan Zheng
Caiming Xiong
Joey Tianyi Zhou
Christopher Ré
AAML, OffRL, OOD
194
140
0
13 Jan 2021
Of Non-Linearity and Commutativity in BERT
Sumu Zhao
Damian Pascual
Gino Brunner
Roger Wattenhofer
101
17
0
12 Jan 2021
A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks
Asaf Noy
Yi Tian Xu
Y. Aflalo
Lihi Zelnik-Manor
Rong Jin
68
3
0
12 Jan 2021
Deeplite Neutrino: An End-to-End Framework for Constrained Deep Learning Model Optimization
A. Sankaran
Olivier Mastropietro
Ehsan Saboori
Yasser Idris
Davis Sawyer
Mohammadhossein Askarihemmat
G. B. Hacene
73
4
0
11 Jan 2021
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
W. Fedus
Barret Zoph
Noam M. Shazeer
MoE
104
2,243
0
11 Jan 2021
Investigating the Vision Transformer Model for Image Retrieval Tasks
S. Gkelios
Y. Boutalis
S. Chatzichristofis
VLM, ViT
75
30
0
11 Jan 2021
Misspelling Correction with Pre-trained Contextual Language Model
Yifei Hu
X. Jing
Youlim Ko
Julia Taylor Rayz
KELM
90
28
0
08 Jan 2021
Learning quantum data with the quantum Earth Mover's distance
B. Kiani
Giacomo De Palma
M. Marvian
Zi-Wen Liu
S. Lloyd
91
47
0
08 Jan 2021
Ask2Transformers: Zero-Shot Domain labelling with Pre-trained Language Models
Oscar Sainz
German Rigau
VLM
73
22
0
07 Jan 2021
LightLayers: Parameter Efficient Dense and Convolutional Layers for Image Classification
Debesh Jha
Anis Yazidi
Michael A. Riegler
Dag Johansen
Haavard D. Johansen
Pål Halvorsen
30
9
0
06 Jan 2021
Model Extraction and Defenses on Generative Adversarial Networks
Hailong Hu
Jun Pang
SILM, MIACV
90
14
0
06 Jan 2021
MSD: Saliency-aware Knowledge Distillation for Multimodal Understanding
Woojeong Jin
Maziar Sanjabi
Shaoliang Nie
L. Tan
Xiang Ren
Hamed Firooz
30
6
0
06 Jan 2021
I-BERT: Integer-only BERT Quantization
Sehoon Kim
A. Gholami
Z. Yao
Michael W. Mahoney
Kurt Keutzer
MQ
173
354
0
05 Jan 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
359
2,560
0
04 Jan 2021
Retrieving and Reading: A Comprehensive Survey on Open-domain Question Answering
Fengbin Zhu
Wenqiang Lei
Chao Wang
Jianming Zheng
Soujanya Poria
Tat-Seng Chua
RALM
273
257
0
04 Jan 2021
Few-Shot Question Answering by Pretraining Span Selection
Ori Ram
Yuval Kirstain
Jonathan Berant
Amir Globerson
Omer Levy
104
98
0
02 Jan 2021
Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting
Wangchunshu Zhou
Tao Ge
Canwen Xu
Ke Xu
Furu Wei
LRM
83
16
0
02 Jan 2021
On-the-Fly Attention Modulation for Neural Generation
Yue Dong
Chandra Bhagavatula
Ximing Lu
Jena D. Hwang
Antoine Bosselut
Jackie C.K. Cheung
Yejin Choi
125
13
0
02 Jan 2021
Analyzing Commonsense Emergence in Few-shot Knowledge Models
Jeff Da
Ronan Le Bras
Ximing Lu
Yejin Choi
Antoine Bosselut
AI4MH, KELM
175
41
0
01 Jan 2021
Rider: Reader-Guided Passage Reranking for Open-Domain Question Answering
Yuning Mao
Pengcheng He
Xiaodong Liu
Yelong Shen
Jianfeng Gao
Jiawei Han
Weizhu Chen
OOD, LRM
196
37
0
01 Jan 2021
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Xiang Lisa Li
Percy Liang
252
4,328
0
01 Jan 2021
WARP: Word-level Adversarial ReProgramming
Karen Hambardzumyan
Hrant Khachatrian
Jonathan May
AAML
337
354
0
01 Jan 2021
Studying Strategically: Learning to Mask for Closed-book QA
Qinyuan Ye
Belinda Z. Li
Sinong Wang
Benjamin Bolte
Hao Ma
Wen-tau Yih
Xiang Ren
Madian Khabsa
OffRL
79
12
0
31 Dec 2020
Shortformer: Better Language Modeling using Shorter Inputs
Ofir Press
Noah A. Smith
M. Lewis
299
91
0
31 Dec 2020