Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 12,362 papers shown
Few-Shot Learning of Compact Models via Task-Specific Meta Distillation
Yong Wu
Shekhor Chanda
M. Hosseinzadeh
Zhi Liu
Yang Wang
VLM
105
8
0
18 Oct 2022
Controllable Fake Document Infilling for Cyber Deception
Yibo Hu
Yu Lin
Eric Parolin
Latif Khan
Kevin W. Hamlen
66
8
0
18 Oct 2022
Taxonomy of Abstractive Dialogue Summarization: Scenarios, Approaches and Future Directions
Qi Jia
Yizhu Liu
Siyu Ren
Kenny Q. Zhu
80
8
0
18 Oct 2022
Transfer learning with affine model transformation
Shunya Minami
Kenji Fukumizu
Yoshihiro Hayashi
Ryo Yoshida
69
1
0
18 Oct 2022
DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for Controllable Text Generation
Hanqing Zhang
Dawei Song
89
37
0
18 Oct 2022
Fine-mixing: Mitigating Backdoors in Fine-tuned Language Models
Zhiyuan Zhang
Lingjuan Lyu
Xingjun Ma
Chenguang Wang
Xu Sun
AAML
64
43
0
18 Oct 2022
Systematicity in GPT-3's Interpretation of Novel English Noun Compounds
Siyan Li
Riley Carlson
Christopher Potts
AI4CE
52
15
0
18 Oct 2022
Using Bottleneck Adapters to Identify Cancer in Clinical Notes under Low-Resource Constraints
Omid Rohanian
Hannah Jauncey
Mohammadmahdi Nouriborji
Vinod Kumar Chauhan
Bronner P. Gonçalves
Christiana Kartsonaki
ISARIC Clinical Characterisation Group
L. Merson
David Clifton
59
7
0
17 Oct 2022
Deepfake Text Detection: Limitations and Opportunities
Jiameng Pu
Zain Sarwar
Sifat Muhammad Abdullah
A. Rehman
Yoonjin Kim
P. Bhattacharya
M. Javed
Bimal Viswanath
AAML
68
57
0
17 Oct 2022
Deep Bidirectional Language-Knowledge Graph Pretraining
Michihiro Yasunaga
Antoine Bosselut
Hongyu Ren
Xikun Zhang
Christopher D. Manning
Percy Liang
J. Leskovec
101
204
0
17 Oct 2022
Non-Contrastive Learning Meets Language-Image Pre-Training
Jinghao Zhou
Li Dong
Zhe Gan
Lijuan Wang
Furu Wei
VLM CLIP
75
26
0
17 Oct 2022
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them
Mirac Suzgun
Nathan Scales
Nathanael Scharli
Sebastian Gehrmann
Yi Tay
...
Aakanksha Chowdhery
Quoc V. Le
Ed H. Chi
Denny Zhou
Jason W. Wei
ALM ELM LRM ReLM
283
1,143
0
17 Oct 2022
Table-To-Text generation and pre-training with TabT5
Ewa Andrejczuk
Julian Martin Eisenschlos
Francesco Piccinno
Syrine Krichene
Yasemin Altun
LMTD
66
31
0
17 Oct 2022
Prompting GPT-3 To Be Reliable
Chenglei Si
Zhe Gan
Zhengyuan Yang
Shuohang Wang
Jianfeng Wang
Jordan L. Boyd-Graber
Lijuan Wang
KELM LRM
113
303
0
17 Oct 2022
Flipped Classroom: Effective Teaching for Time Series Forecasting
P. Teutsch
Patrick Mäder
AI4TS
67
8
0
17 Oct 2022
Approximating Continuous Convolutions for Deep Network Compression
Theo W. Costain
V. Prisacariu
66
0
0
17 Oct 2022
Meta-Learning via Classifier(-free) Diffusion Guidance
Elvis Nava
Seijin Kobayashi
Yifei Yin
Robert K. Katzschmann
Benjamin Grewe
VLM
71
6
0
17 Oct 2022
Plug-and-Play VQA: Zero-shot VQA by Conjoining Large Pretrained Models with Zero Training
A. M. H. Tiong
Junnan Li
Boyang Albert Li
Silvio Savarese
Guosheng Lin
MLLM
131
109
0
17 Oct 2022
Keep Me Updated! Memory Management in Long-term Conversations
Sanghwan Bae
Donghyun Kwak
Soyoung Kang
Min Young Lee
Sungdong Kim
Yuin Jeong
Hyeri Kim
Sang-Woo Lee
W. Park
Nako Sung
116
51
0
17 Oct 2022
RARR: Researching and Revising What Language Models Say, Using Language Models
Luyu Gao
Zhuyun Dai
Panupong Pasupat
Anthony Chen
Arun Tejasvi Chaganty
...
Vincent Zhao
Ni Lao
Hongrae Lee
Da-Cheng Juan
Kelvin Guu
HILM KELM
133
260
0
17 Oct 2022
Continuous Pseudo-Labeling from the Start
Dan Berrebbi
R. Collobert
Samy Bengio
Navdeep Jaitly
Tatiana Likhomanenko
65
16
0
17 Oct 2022
Teacher Forcing Recovers Reward Functions for Text Generation
Yongchang Hao
Yuxin Liu
Lili Mou
OffRL
91
12
0
17 Oct 2022
Review learning: Real world validation of privacy preserving continual learning across medical institutions
Jaesung Yoo
Sung-Hyuk Choi
Yewon Yang
Suhyeon Kim
J. Choi
...
H. J. Joo
Dae-Jung Kim
R. Park
Hyeong-Jin Yoon
Kwangsoo Kim
OffRL KELM
86
0
0
17 Oct 2022
Accelerating Transfer Learning with Near-Data Computation on Cloud Object Stores
Arsany Guirguis
Diana Petrescu
Florin Dinu
D. Quoc
Javier Picorel
R. Guerraoui
71
0
0
16 Oct 2022
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning
Tzu-hsun Feng
Annie Dong
Ching-Feng Yeh
Shu-Wen Yang
Tzu-Quan Lin
...
Xuankai Chang
Shinji Watanabe
Abdel-rahman Mohamed
Shang-Wen Li
Hung-yi Lee
ELM SSL
102
35
0
16 Oct 2022
NormSAGE: Multi-Lingual Multi-Cultural Norm Discovery from Conversations On-the-Fly
Yi R. Fung
Tuhin Chakraborty
Hao Guo
Owen Rambow
Smaranda Muresan
Heng Ji
83
42
0
16 Oct 2022
Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective
Ping Yang
Junjie Wang
Ruyi Gan
Xinyu Zhu
Lin Zhang
Ziwei Wu
Xinyu Gao
Jiaxing Zhang
Tetsuya Sakai
BDL
71
26
0
16 Oct 2022
Knowledge Prompting in Pre-trained Language Model for Natural Language Understanding
Jiadong Wang
Wenkang Huang
Qiuhui Shi
Hongbin Wang
Minghui Qiu
Xiang Li
Ming Gao
KELM VLM
90
19
0
16 Oct 2022
Model Criticism for Long-Form Text Generation
Yuntian Deng
Volodymyr Kuleshov
Alexander M. Rush
110
19
0
16 Oct 2022
LAION-5B: An open large-scale dataset for training next generation image-text models
Christoph Schuhmann
Romain Beaumont
Richard Vencu
Cade Gordon
Ross Wightman
...
Srivatsa Kundurthy
Katherine Crowson
Ludwig Schmidt
R. Kaczmarczyk
J. Jitsev
VLM MLLM CLIP
231
3,520
0
16 Oct 2022
Temporal Word Meaning Disambiguation using TimeLMs
Mihir Godbole
Parth Dandavate
Aditya Kane
74
2
0
15 Oct 2022
TestAug: A Framework for Augmenting Capability-based NLP Tests
Guanqun Yang
Mirazul Haque
Qiaochu Song
Wei Yang
Xueqing Liu
ELM
61
0
0
14 Oct 2022
Injecting Domain Knowledge from Empirical Interatomic Potentials to Neural Networks for Predicting Material Properties
Zeren Shui
Daniel S. Karls
Mingjian Wen
Ilia Nikiforov
E. Tadmor
George Karypis
72
8
0
14 Oct 2022
PseudoReasoner: Leveraging Pseudo Labels for Commonsense Knowledge Base Population
Tianqing Fang
Quyet V. Do
Hongming Zhang
Yangqiu Song
Ginny Wong
Simon See
LRM
104
11
0
14 Oct 2022
The Debate Over Understanding in AI's Large Language Models
Melanie Mitchell
D. Krakauer
ELM
155
222
0
14 Oct 2022
Sequential Learning Of Neural Networks for Prequential MDL
J. Bornschein
Yazhe Li
Marcus Hutter
AI4TS
55
7
0
14 Oct 2022
Free Fine-tuning: A Plug-and-Play Watermarking Scheme for Deep Neural Networks
Run Wang
Jixing Ren
Boheng Li
Tianyi She
Wenhui Zhang
Liming Fang
Jing Chen
Chao Shen
Lina Wang
WIGM
79
19
0
14 Oct 2022
Robust Preference Learning for Storytelling via Contrastive Reinforcement Learning
Louis Castricato
Alexander Havrilla
Shahbuland Matiana
Michael Pieler
Anbang Ye
Ian Yang
Spencer Frazier
Mark O. Riedl
95
13
0
14 Oct 2022
Extracting Cultural Commonsense Knowledge at Scale
Shrestha Ghosh
Simon Razniewski
A. Varde
Gerhard Weikum
106
66
0
14 Oct 2022
Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey
Sachin Kumar
Vidhisha Balachandran
Lucille Njoo
Antonios Anastasopoulos
Yulia Tsvetkov
ELM
185
91
0
14 Oct 2022
CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
Jinchao Zhang
Shuyang Jiang
Jiangtao Feng
Lin Zheng
Dianbo Sui
3DV
195
9
0
14 Oct 2022
Abstract-to-Executable Trajectory Translation for One-Shot Task Generalization
Stone Tao
Xiaochen Li
Tongzhou Mu
Zhiao Huang
Yuzhe Qin
Hao Su
68
3
0
14 Oct 2022
Enabling Classifiers to Make Judgements Explicitly Aligned with Human Values
Yejin Bang
Tiezheng Yu
Andrea Madotto
Zhaojiang Lin
Mona T. Diab
Pascale Fung
74
13
0
14 Oct 2022
A Survey of Parameters Associated with the Quality of Benchmarks in NLP
Swaroop Mishra
Anjana Arunkumar
Chris Bryan
Chitta Baral
105
1
0
14 Oct 2022
DyLoRA: Parameter Efficient Tuning of Pre-trained Models using Dynamic Search-Free Low-Rank Adaptation
Mojtaba Valipour
Mehdi Rezagholizadeh
I. Kobyzev
A. Ghodsi
149
184
0
14 Oct 2022
Self-Adaptive Named Entity Recognition by Retrieving Unstructured Knowledge
Kosuke Nishida
Naoki Yoshinaga
Kyosuke Nishida
86
2
0
14 Oct 2022
Can Language Representation Models Think in Bets?
Zhi-Bin Tang
Mayank Kejriwal
51
6
0
14 Oct 2022
"John is 50 years old, can his son be 65?" Evaluating NLP Models' Understanding of Feasibility
Himanshu Gupta
Neeraj Varshney
Swaroop Mishra
Kuntal Kumar Pal
Saurabh Arjun Sawant
Kevin Scaria
Siddharth Goyal
Chitta Baral
ELM
100
14
0
14 Oct 2022
Transparency Helps Reveal When Language Models Learn Meaning
Zhaofeng Wu
William Merrill
Hao Peng
Iz Beltagy
Noah A. Smith
59
10
0
14 Oct 2022
Mind the Labels: Describing Relations in Knowledge Graphs With Pretrained Models
Zdeněk Kasner
Ioannis Konstas
Ondrej Dusek
78
6
0
13 Oct 2022