Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 12,349 papers shown
Diffusion Models: A Comprehensive Survey of Methods and Applications
Ling Yang
Zhilong Zhang
Yingxia Shao
Shenda Hong
Runsheng Xu
Yue Zhao
Wentao Zhang
Tengjiao Wang
Ming-Hsuan Yang
DiffM MedIm
485
1,420
0
02 Sep 2022
Recurrent Convolutional Neural Networks Learn Succinct Learning Algorithms
Surbhi Goel
Sham Kakade
Adam Tauman Kalai
Cyril Zhang
81
1
0
01 Sep 2022
In conversation with Artificial Intelligence: aligning language models with human values
Atoosa Kasirzadeh
Iason Gabriel
126
105
0
01 Sep 2022
SkeletonMAE: Spatial-Temporal Masked Autoencoders for Self-supervised Skeleton Action Recognition
Wenhan Wu
Yilei Hua
Ce Zheng
Shi-Bao Wu
Chong Chen
Aidong Lu
ViT
137
36
0
01 Sep 2022
Visual Prompting via Image Inpainting
Amir Bar
Yossi Gandelsman
Trevor Darrell
Amir Globerson
Alexei A. Efros
VLM VPVLM
75
212
0
01 Sep 2022
Why Do Neural Language Models Still Need Commonsense Knowledge to Handle Semantic Variations in Question Answering?
Sunjae Kwon
Cheongwoong Kang
Jiyeon Han
Jaesik Choi
59
0
0
01 Sep 2022
Transformers are Sample-Efficient World Models
Vincent Micheli
Eloi Alonso
François Fleuret
VLM OffRL
185
189
0
01 Sep 2022
DramatVis Personae: Visual Text Analytics for Identifying Social Biases in Creative Writing
Md. Naimul Hoque
Bhavya Ghai
Niklas Elmqvist
63
18
0
01 Sep 2022
Generating Coherent Drum Accompaniment With Fills And Improvisations
Rishabh A. Dahale
Vaibhav Talwadker
Preeti Rao
Prateek Verma
59
3
0
01 Sep 2022
Deep Sparse Conformer for Speech Recognition
Xianchao Wu
41
2
0
01 Sep 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
154
114
0
31 Aug 2022
Negative Human Rights as a Basis for Long-term AI Safety and Regulation
Ondrej Bajgar
Jan Horenovsky
FaML
70
9
0
31 Aug 2022
LINKS: A dataset of a hundred million planar linkage mechanisms for data-driven kinematic design
Amin Heyrani Nobari
Akash Srivastava
Dan Gutfreund
Faez Ahmed
3DV PINN AI4CE
49
11
0
30 Aug 2022
Annotated Dataset Creation through General Purpose Language Models for non-English Medical NLP
Johann Frei
Frank Kramer
62
2
0
30 Aug 2022
Flexible Job Classification with Zero-Shot Learning
Thom Lake
VLM
105
1
0
30 Aug 2022
AutoWS-Bench-101: Benchmarking Automated Weak Supervision with 100 Labels
Nicholas Roberts
Xintong Li
Tzu-Heng Huang
Dyah Adila
Spencer Schoenberg
Chengao Liu
Lauren Pick
Haotian Ma
Aws Albarghouthi
Frederic Sala
UQCV
117
9
0
30 Aug 2022
ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization
Cong Guo
Chen Zhang
Jingwen Leng
Zihan Liu
Fan Yang
Yun-Bo Liu
Minyi Guo
Yuhao Zhu
MQ
83
60
0
30 Aug 2022
Towards Boosting the Open-Domain Chatbot with Human Feedback
Hua Lu
Siqi Bao
H. He
Fan Wang
Hua Wu
Haifeng Wang
ALM
69
19
0
30 Aug 2022
Transformers with Learnable Activation Functions
Haishuo Fang
Ji-Ung Lee
N. Moosavi
Iryna Gurevych
AI4CE
46
8
0
30 Aug 2022
Super-model ecosystem: A domain-adaptation perspective
Fengxiang He
Dacheng Tao
DiffM
84
1
0
30 Aug 2022
The Alignment Problem from a Deep Learning Perspective
Richard Ngo
Lawrence Chan
Sören Mindermann
139
192
0
30 Aug 2022
SB-SSL: Slice-Based Self-Supervised Transformers for Knee Abnormality Classification from MRI
Sara Atito
Syed Muhammad Anwar
Muhammad Awais
Josef Kittler
ViT MedIm
56
12
0
29 Aug 2022
Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis
Wanshu Fan
Yen-Chun Chen
Dongdong Chen
Yu Cheng
Lu Yuan
Yu-Chiang Frank Wang
DiffM
92
97
0
29 Aug 2022
On Grounded Planning for Embodied Tasks with Language Models
Bill Yuchen Lin
Chengsong Huang
Qian Liu
Wenda Gu
Sam Sommerer
Xiang Ren
LM&Ro
114
41
0
29 Aug 2022
Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment
Mustafa Shukor
Guillaume Couairon
Matthieu Cord
VLM CLIP
100
27
0
29 Aug 2022
LogicRank: Logic Induced Reranking for Generative Text-to-Image Systems
Bjorn Deiseroth
P. Schramowski
Hikaru Shindo
Devendra Singh Dhami
Kristian Kersting
EGVM DiffM
49
2
0
29 Aug 2022
ClusTR: Exploring Efficient Self-attention via Clustering for Vision Transformers
Yutong Xie
Jianpeng Zhang
Yong-quan Xia
Anton Van Den Hengel
Qi Wu
61
6
0
28 Aug 2022
MDIA: A Benchmark for Multilingual Dialogue Generation in 46 Languages
Qingyuan Zhang
Xiaoyu Shen
Ernie Chang
Jidong Ge
Peng-Jen Chen
75
14
0
27 Aug 2022
On Unsupervised Training of Link Grammar Based Language Models
N. Mikhaylovskiy
LRM
24
0
0
27 Aug 2022
What Do NLP Researchers Believe? Results of the NLP Community Metasurvey
Julian Michael
Ari Holtzman
Alicia Parrish
Aaron Mueller
Alex Jinpeng Wang
...
Divyam Madaan
Nikita Nangia
Richard Yuanzhe Pang
Jason Phang
Sam Bowman
71
39
0
26 Aug 2022
DiVa: An Accelerator for Differentially Private Machine Learning
Beom-Joo Park
Ranggi Hwang
Dongho Yoon
Yoonhyuk Choi
Minsoo Rhu
53
9
0
26 Aug 2022
MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining
Xiaoyi Dong
Jianmin Bao
Yinglin Zheng
Ting Zhang
Dongdong Chen
...
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
CLIP VLM
113
167
0
25 Aug 2022
Shortcut Learning of Large Language Models in Natural Language Understanding
Mengnan Du
Fengxiang He
Na Zou
Dacheng Tao
Helen Zhou
KELM OffRL
134
90
0
25 Aug 2022
Rethinking Cost-sensitive Classification in Deep Learning via Adversarial Data Augmentation
Qiyuan Chen
Raed Al Kontar
Maher Nouiehed
Xi Yang
Corey A. Lester
AAML
53
2
0
24 Aug 2022
PEER: A Collaborative Language Model
Timo Schick
Jane Dwivedi-Yu
Zhengbao Jiang
Fabio Petroni
Patrick Lewis
Gautier Izacard
Qingfei You
Christoforos Nalmpantis
Edouard Grave
Sebastian Riedel
ALM
106
97
0
24 Aug 2022
Of Human Criteria and Automatic Metrics: A Benchmark of the Evaluation of Story Generation
Cyril Chhun
Pierre Colombo
Chloé Clavel
Fabian M. Suchanek
191
55
0
24 Aug 2022
Repair Is Nearly Generation: Multilingual Program Repair with LLMs
Harshit Joshi
J. Cambronero
Sumit Gulwani
Vu Le
Ivan Radicek
Gust Verbruggen
LRM
72
136
0
24 Aug 2022
PromptFL: Let Federated Participants Cooperatively Learn Prompts Instead of Models -- Federated Learning in Age of Foundation Model
Tao Guo
Song Guo
Junxiao Wang
Wenchao Xu
FedML VLM LRM
71
127
0
24 Aug 2022
CheapET-3: Cost-Efficient Use of Remote DNN Models
Michael Weiss
60
1
0
24 Aug 2022
DPTDR: Deep Prompt Tuning for Dense Passage Retrieval
Zhen-Quan Tang
Benyou Wang
Ting Yao
VLM
61
14
0
24 Aug 2022
FactMix: Using a Few Labeled In-domain Examples to Generalize to Cross-domain Named Entity Recognition
Linyi Yang
Lifan Yuan
Leyang Cui
Wen Gao
Yue Zhang
110
16
0
24 Aug 2022
Prompting as Probing: Using Language Models for Knowledge Base Construction
Dimitrios Alivanistos
Selene Báez Santamaría
Michael Cochez
Jan-Christoph Kalo
Emile van Krieken
Thiviyan Thanapalasingam
KELM
98
48
0
23 Aug 2022
Evaluate Confidence Instead of Perplexity for Zero-shot Commonsense Reasoning
Letian Peng
Z. Li
Hai Zhao
ReLM LRM
45
1
0
23 Aug 2022
AI and 6G into the Metaverse: Fundamentals, Challenges and Future Research Trends
Muhammad Zawish
Fayaz Ali Dharejo
Sunder Ali Khowaja
Saleem Raza
Steven Davy
Kapal Dev
P. Bellavista
82
68
0
23 Aug 2022
Lottery Pools: Winning More by Interpolating Tickets without Increasing Training or Inference Cost
Lu Yin
Shiwei Liu
Fang Meng
Tianjin Huang
Vlado Menkovski
Mykola Pechenizkiy
54
13
0
23 Aug 2022
Learning More May Not Be Better: Knowledge Transferability in Vision and Language Tasks
Tianwei Chen
Noa Garcia
Mayu Otani
Chenhui Chu
Yuta Nakashima
Hajime Nagahara
VLM
56
0
0
23 Aug 2022
The GENEA Challenge 2022: A large evaluation of data-driven co-speech gesture generation
Youngwoo Yoon
Pieter Wolfert
Taras Kucherenko
Carla Viegas
Teodor Nikolov
Mihail Tsakov
G. Henter
VGen
82
81
0
22 Aug 2022
Efficient Planning in a Compact Latent Action Space
Zhengyao Jiang
Tianjun Zhang
Michael Janner
Yueying Li
Tim Rocktäschel
Edward Grefenstette
Yuandong Tian
OffRL
89
40
0
22 Aug 2022
PANDA: Prompt Transfer Meets Knowledge Distillation for Efficient Model Adaptation
Qihuang Zhong
Liang Ding
Juhua Liu
Bo Du
Dacheng Tao
VLM CLL
94
44
0
22 Aug 2022
A Unified Analysis of Mixed Sample Data Augmentation: A Loss Function Perspective
Chanwoo Park
Sangdoo Yun
Sanghyuk Chun
AAML
83
32
0
21 Aug 2022