ResearchTrend.AI
Language Models are Few-Shot Learners


28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 11,657 papers shown
Memorization in NLP Fine-tuning Methods
Fatemehsadat Mireshghallah
Archit Uniyal
Tianhao Wang
David Evans
Taylor Berg-Kirkpatrick
AAML
70
39
0
25 May 2022
Learning a Better Initialization for Soft Prompts via Meta-Learning
Yukun Huang
Kun Qian
Zhou Yu
VLM
68
9
0
25 May 2022
Know Where You're Going: Meta-Learning for Parameter-Efficient Fine-Tuning
Mozhdeh Gheini
Xuezhe Ma
Jonathan May
62
5
0
25 May 2022
Generating Natural Language Proofs with Verifier-Guided Search
Kaiyu Yang
Jia Deng
Danqi Chen
LRM
36
69
0
25 May 2022
Learning Action Conditions from Instructional Manuals for Instruction Understanding
Te-Lin Wu
Caiqi Zhang
Qingyuan Hu
Alexander Spangher
Nanyun Peng
34
4
0
25 May 2022
New Intent Discovery with Pre-training and Contrastive Learning
Yuwei Zhang
Haode Zhang
Li-Ming Zhan
Albert Y. S. Lam
SSL
VLM
47
41
0
25 May 2022
Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT
James Lee-Thorp
Joshua Ainslie
MoE
39
11
0
24 May 2022
Fine-tuned Language Models are Continual Learners
Thomas Scialom
Tuhin Chakrabarty
Smaranda Muresan
CLL
LRM
145
118
0
24 May 2022
Challenges and Opportunities in Information Manipulation Detection: An Examination of Wartime Russian Media
Chan Young Park
Julia Mendelsohn
Anjalie Field
Yulia Tsvetkov
AAML
37
27
0
24 May 2022
Structured Prompt Tuning
Chi-Liang Liu
Hung-yi Lee
Wen-tau Yih
27
3
0
24 May 2022
TALM: Tool Augmented Language Models
Aaron T Parisi
Yao-Min Zhao
Noah Fiedel
KELM
RALM
LLMAG
41
144
0
24 May 2022
GeoMLAMA: Geo-Diverse Commonsense Probing on Multilingual Pre-Trained Language Models
Da Yin
Hritik Bansal
Masoud Monajatipoor
Liunian Harold Li
Kai-Wei Chang
51
28
0
24 May 2022
PoeLM: A Meter- and Rhyme-Controllable Language Model for Unsupervised Poetry Generation
Aitor Ormazabal
Mikel Artetxe
Manex Agirrezabal
Aitor Soroa Etxabe
Eneko Agirre
29
21
0
24 May 2022
Enhancing Continual Learning with Global Prototypes: Counteracting Negative Representation Drift
Xueying Bai
Jinghuan Shang
Yifan Sun
Niranjan Balasubramanian
CLL
35
1
0
24 May 2022
The Curious Case of Control
Elias Stengel-Eskin
Benjamin Van Durme
30
0
0
24 May 2022
ATTEMPT: Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts
Akari Asai
Mohammadreza Salehi
Matthew E. Peters
Hannaneh Hajishirzi
130
100
0
24 May 2022
The Authenticity Gap in Human Evaluation
Kawin Ethayarajh
Dan Jurafsky
87
24
0
24 May 2022
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
337
4,077
0
24 May 2022
Lutma: a Frame-Making Tool for Collaborative FrameNet Development
Tiago Timponi Torrent
Arthur Lorenzi
E. Matos
Frederico Belcavello
Marcelo Viridiano
Maucha Gamonal
22
1
0
24 May 2022
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations
Jaehun Jung
Lianhui Qin
Sean Welleck
Faeze Brahman
Chandra Bhagavatula
Ronan Le Bras
Yejin Choi
ReLM
LRM
229
190
0
24 May 2022
Formulating Few-shot Fine-tuning Towards Language Model Pre-training: A Pilot Study on Named Entity Recognition
Zihan Wang
Kewen Zhao
Zilong Wang
Jingbo Shang
46
6
0
24 May 2022
Deep Learning Meets Software Engineering: A Survey on Pre-Trained Models of Source Code
Changan Niu
Chuanyi Li
Bin Luo
Vincent Ng
SyDa
VLM
60
48
0
24 May 2022
On the Role of Bidirectionality in Language Model Pre-Training
Mikel Artetxe
Jingfei Du
Naman Goyal
Luke Zettlemoyer
Ves Stoyanov
30
16
0
24 May 2022
Semi-Parametric Inducing Point Networks and Neural Processes
R. Rastogi
Yair Schiff
Alon Hacohen
Zhaozhi Li
I-Hsiang Lee
Yuntian Deng
M. Sabuncu
Volodymyr Kuleshov
3DPC
29
6
0
24 May 2022
On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization
Shruti Palaskar
Akshita Bhagia
Yonatan Bisk
Florian Metze
A. Black
Ana Marasović
33
4
0
24 May 2022
Penguins Don't Fly: Reasoning about Generics through Instantiations and Exceptions
Emily Allaway
Jena D. Hwang
Chandra Bhagavatula
Kathleen McKeown
Doug Downey
Yejin Choi
LRM
44
20
0
23 May 2022
Towards Opening the Black Box of Neural Machine Translation: Source and Target Interpretations of the Transformer
Javier Ferrando
Gerard I. Gállego
Belen Alastruey
Carlos Escolano
Marta R. Costa-jussá
35
44
0
23 May 2022
BolT: Fused Window Transformers for fMRI Time Series Analysis
H. Bedel
Irmak Sivgin
Onat Dalmaz
S. Dar
Tolga Çukur
64
54
0
23 May 2022
Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Arbitrary Textual Style Transfer with Small Language Models
Mirac Suzgun
Luke Melas-Kyriazi
Dan Jurafsky
VLM
87
66
0
23 May 2022
VQA-GNN: Reasoning with Multimodal Knowledge via Graph Neural Networks for Visual Question Answering
Yanan Wang
Michihiro Yasunaga
Hongyu Ren
Shinya Wada
J. Leskovec
29
17
0
23 May 2022
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
Chitwan Saharia
William Chan
Saurabh Saxena
Lala Li
Jay Whang
...
Raphael Gontijo-Lopes
Tim Salimans
Jonathan Ho
David J Fleet
Mohammad Norouzi
VLM
113
5,817
0
23 May 2022
Memory-enriched computation and learning in spiking neural networks through Hebbian plasticity
Thomas Limbacher
Ozan Özdenizci
Robert Legenstein
21
2
0
23 May 2022
BBTv2: Towards a Gradient-Free Future with Large Language Models
Tianxiang Sun
Zhengfu He
Hong Qian
Yunhua Zhou
Xuanjing Huang
Xipeng Qiu
108
53
0
23 May 2022
PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models
Yuan Yao
Qi-An Chen
Ao Zhang
Wei Ji
Zhiyuan Liu
Tat-Seng Chua
Maosong Sun
VLM
MLLM
29
38
0
23 May 2022
TempLM: Distilling Language Models into Template-Based Generators
Tianyi Zhang
Mina Lee
Lisa Li
Ende Shen
Tatsunori B. Hashimoto
VLM
45
5
0
23 May 2022
muNet: Evolving Pretrained Deep Neural Networks into Scalable Auto-tuning Multitask Systems
Andrea Gesmundo
J. Dean
38
19
0
22 May 2022
What Do Compressed Multilingual Machine Translation Models Forget?
Alireza Mohammadshahi
Vassilina Nikoulina
Alexandre Berard
Caroline Brun
James Henderson
Laurent Besacier
AI4CE
46
9
0
22 May 2022
Chain of Thought Imitation with Procedure Cloning
Mengjiao Yang
Dale Schuurmans
Pieter Abbeel
Ofir Nachum
OffRL
35
30
0
22 May 2022
Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models
Kushal Tirumala
Aram H. Markosyan
Luke Zettlemoyer
Armen Aghajanyan
TDI
34
187
0
22 May 2022
Life after BERT: What do Other Muppets Understand about Language?
Vladislav Lialin
Kevin Zhao
Namrata Shivagunde
Anna Rumshisky
49
6
0
21 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
137
354
0
21 May 2022
Least-to-Most Prompting Enables Complex Reasoning in Large Language Models
Denny Zhou
Nathanael Scharli
Le Hou
Jason W. Wei
Nathan Scales
...
Dale Schuurmans
Claire Cui
Olivier Bousquet
Quoc Le
Ed H. Chi
RALM
LRM
AI4CE
27
1,061
0
21 May 2022
Few-Shot Natural Language Inference Generation with PDD: Prompt and Dynamic Demonstration
Kaijian Li
Shansan Gong
Kenny Q. Zhu
32
0
0
21 May 2022
Multilingual Normalization of Temporal Expressions with Masked Language Models
Lukas Lange
Jannik Strötgen
Heike Adel
Dietrich Klakow
37
6
0
20 May 2022
Prototypical Calibration for Few-shot Learning of Language Models
Zhixiong Han
Y. Hao
Li Dong
Yutao Sun
Furu Wei
178
54
0
20 May 2022
Visually-Augmented Language Modeling
Weizhi Wang
Li Dong
Hao Cheng
Haoyu Song
Xiaodong Liu
Xifeng Yan
Jianfeng Gao
Furu Wei
VLM
41
18
0
20 May 2022
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality
Xiang Li
Wenhai Wang
Lingfeng Yang
Jian Yang
119
73
0
20 May 2022
Can Foundation Models Wrangle Your Data?
A. Narayan
Ines Chami
Laurel J. Orr
Simran Arora
Christopher Ré
LMTD
AI4CE
181
214
0
20 May 2022
Fidyll: A Compiler for Cross-Format Data Stories & Explorable Explanations
Matthew Conlen
Jeffrey Heer
LMTD
37
6
0
19 May 2022
MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation
Vikram S. Voleti
Alexia Jolicoeur-Martineau
Christopher Pal
DiffM
VGen
20
291
0
19 May 2022