Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 12,337 papers shown
Transformers as Meta-Learners for Implicit Neural Representations
Yinbo Chen
Xiaolong Wang
AI4CE
99
68
0
04 Aug 2022
MVSFormer: Multi-View Stereo by Learning Robust Image Features and Temperature-based Depth
Chenjie Cao
Xinlin Ren
Yanwei Fu
108
54
0
04 Aug 2022
Prompt Tuning for Generative Multimodal Pretrained Models
Han Yang
Junyang Lin
An Yang
Peng Wang
Chang Zhou
Hongxia Yang
VLM LRM VPVLM
86
31
0
04 Aug 2022
Fusing Sentence Embeddings Into LSTM-based Autoregressive Language Models
Vilém Zouhar
Marius Mosbach
Dietrich Klakow
57
1
0
04 Aug 2022
Learning Prior Feature and Attention Enhanced Image Inpainting
Chenjie Cao
Qiaole Dong
Yanwei Fu
DiffM
78
26
0
03 Aug 2022
Two-Stream Transformer Architecture for Long Video Understanding
Edward Fish
Jon Weinbren
Andrew Gilbert
ViT
52
6
0
02 Aug 2022
The Curse of Low Task Diversity: On the Failure of Transfer Learning to Outperform MAML and Their Empirical Equivalence
Brando Miranda
P. Yu
Yu-Xiong Wang
Oluwasanmi Koyejo
85
10
0
02 Aug 2022
AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model
Saleh Soltan
Shankar Ananthakrishnan
Jack G. M. FitzGerald
Rahul Gupta
Wael Hamza
...
Mukund Sridhar
Fabian Triefenbach
Apurv Verma
Gokhan Tur
Premkumar Natarajan
129
83
0
02 Aug 2022
To Answer or Not to Answer? Improving Machine Reading Comprehension Model with Span-based Contrastive Learning
Yunjie Ji
Liangyu Chen
Chenxiao Dou
Baochang Ma
Xiangang Li
82
5
0
02 Aug 2022
Exploring the GLIDE model for Human Action-effect Prediction
Fangjun Li
David C. Hogg
Anthony G. Cohn
50
0
0
01 Aug 2022
What Can Transformers Learn In-Context? A Case Study of Simple Function Classes
Shivam Garg
Dimitris Tsipras
Percy Liang
Gregory Valiant
158
514
0
01 Aug 2022
SMART: Sentences as Basic Units for Text Evaluation
Reinald Kim Amplayo
Peter J. Liu
Yao-Min Zhao
Shashi Narayan
79
22
0
01 Aug 2022
Learning from flowsheets: A generative transformer model for autocompletion of flowsheets
Gabriel Vogel
Lukas Schulze Balhorn
Artur M. Schweidtmann
AI4CE
99
36
0
01 Aug 2022
Efficient Long-Text Understanding with Short-Text Models
Maor Ivgi
Uri Shaham
Jonathan Berant
VLM
128
84
0
01 Aug 2022
Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization
T. Nguyen
Richard G. Baraniuk
Robert M. Kirby
Stanley J. Osher
Bao Wang
127
9
0
01 Aug 2022
Augmenting Vision Language Pretraining by Learning Codebook with Visual Semantics
Xiaoyuan Guo
Jiali Duan
C.-C. Jay Kuo
J. Gichoya
Imon Banerjee
VLM
46
1
0
31 Jul 2022
Smoothing Entailment Graphs with Language Models
Nick McKenna
Tianyi Li
Mark Johnson
Mark Steedman
64
11
0
30 Jul 2022
Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding
Hao Wen
Yunze Liu
Jingwei Huang
Bokun Duan
Li Yi
ViT 3DPC
87
28
0
30 Jul 2022
A Survey on Masked Autoencoder for Self-supervised Learning in Vision and Beyond
Chaoning Zhang
Chenshuang Zhang
Junha Song
John Seon Keun Yi
Kang Zhang
In So Kweon
SSL
96
77
0
30 Jul 2022
Language Models Can Teach Themselves to Program Better
Patrick M. Haluptzok
Matthew Bowers
Adam Tauman Kalai
ReLM SyDa LRM
125
82
0
29 Jul 2022
LAD: Language Models as Data for Zero-Shot Dialog
Shikib Mehri
Yasemin Altun
M. Eskénazi
70
26
0
28 Jul 2022
Large Language Models and the Reverse Turing Test
T. Sejnowski
ELM
148
114
0
28 Jul 2022
Pro-tuning: Unified Prompt Tuning for Vision Tasks
Xing Nie
Bolin Ni
Jianlong Chang
Gaomeng Meng
Chunlei Huo
Zhaoxiang Zhang
Shiming Xiang
Qi Tian
Chunhong Pan
AAML VPVLM VLM
122
76
0
28 Jul 2022
Efficient Training of Language Models to Fill in the Middle
Mohammad Bavarian
Heewoo Jun
Nikolas Tezak
John Schulman
C. McLeavey
Jerry Tworek
Mark Chen
91
197
0
28 Jul 2022
Sequence to sequence pretraining for a less-resourced Slovenian language
Matej Ulčar
Marko Robnik-Šikonja
AIMat
68
17
0
28 Jul 2022
HelixFold-Single: MSA-free Protein Structure Prediction by Using Protein Language Model as an Alternative
Xiaomin Fang
Fan Wang
Lihang Liu
Jingzhou He
Dayong Lin
Yingfei Xiang
Xiaonan Zhang
Hua Wu
Hui Li
Le Song
65
54
0
28 Jul 2022
Explain My Surprise: Learning Efficient Long-Term Memory by Predicting Uncertain Outcomes
A. Sorokin
N. Buzun
Leonid Pugachev
Andrey Kravchenko
167
8
0
27 Jul 2022
RealTime QA: What's the Answer Right Now?
Jungo Kasai
Keisuke Sakaguchi
Yoichi Takahashi
Ronan Le Bras
Akari Asai
Xinyan Velocity Yu
Dragomir R. Radev
Noah A. Smith
Yejin Choi
Kentaro Inui
KELM
162
194
0
27 Jul 2022
Is Attention All That NeRF Needs?
T. MukundVarma
Peihao Wang
Xuxi Chen
Tianlong Chen
Subhashini Venugopalan
Zhangyang Wang
ViT
129
113
0
27 Jul 2022
Contextual Information and Commonsense Based Prompt for Emotion Recognition in Conversation
Jingjie Yi
Deqing Yang
Siyu Yuan
Caiyan Cao
Zhiyao Zhang
Yanghua Xiao
61
9
0
27 Jul 2022
Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks
Tilman Räuker
A. Ho
Stephen Casper
Dylan Hadfield-Menell
AAML AI4CE
128
134
0
27 Jul 2022
Retrieval-Augmented Transformer for Image Captioning
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
88
59
0
26 Jul 2022
NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers
Jiawei Liu
Jinkun Lin
Fabian Ruffy
Cheng Tan
Jinyang Li
Aurojit Panda
Lingming Zhang
158
75
0
26 Jul 2022
Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models
Robin Rombach
A. Blattmann
Bjorn Ommer
DiffM
83
71
0
26 Jul 2022
LaKo: Knowledge-driven Visual Question Answering via Late Knowledge-to-Text Injection
Zhuo Chen
Yufen Huang
Jiaoyan Chen
Yuxia Geng
Yin Fang
Jeff Z. Pan
Ningyu Zhang
Wen Zhang
95
38
0
26 Jul 2022
A Hazard Analysis Framework for Code Synthesis Large Language Models
Heidy Khlaaf
Pamela Mishkin
Joshua Achiam
Gretchen Krueger
Miles Brundage
ELM
74
29
0
25 Jul 2022
Self-Distilled Vision Transformer for Domain Generalization
M. Sultana
Muzammal Naseer
Muhammad Haris Khan
Salman Khan
Fahad Shahbaz Khan
ViT
76
31
0
25 Jul 2022
Fine-Tuning BERT for Automatic ADME Semantic Labeling in FDA Drug Labeling to Enhance Product-Specific Guidance Assessment
Yiwen Shi
Jing Wang
Ping Ren
Taha ValizadehAslani
Yi Zhang
Meng Hu
Hualou Liang
AI4MH AAML
71
17
0
25 Jul 2022
Is GPT-3 all you need for Visual Question Answering in Cultural Heritage?
P. Bongini
Federico Becattini
A. Bimbo
44
13
0
25 Jul 2022
Intention-Conditioned Long-Term Human Egocentric Action Forecasting
Esteve Valls Mascaro
Hyemin Ahn
Dongheui Lee
EgoV
100
31
0
25 Jul 2022
Neural Generation Meets Real People: Building a Social, Informative Open-Domain Dialogue Agent
Ethan A. Chi
Ashwin Paranjape
A. See
Caleb Chiam
Trenton Chang
...
Dilara Soylu
Jillian Tang
A. Narayan
Giovanni Campagna
Christopher D. Manning
87
7
0
25 Jul 2022
No More Fine-Tuning? An Experimental Evaluation of Prompt Tuning in Code Intelligence
Chaozheng Wang
Yuanhang Yang
Cuiyun Gao
Yun Peng
Hongyu Zhang
Michael R. Lyu
AAML
115
144
0
24 Jul 2022
High-Resolution Swin Transformer for Automatic Medical Image Segmentation
Chen Wei
Shenghan Ren
Kaitai Guo
Haihong Hu
Jimin Liang
ViT OOD MedIm
57
43
0
23 Jul 2022
Catch Me If You Can: Deceiving Stance Detection and Geotagging Models to Protect Privacy of Individuals on Twitter
Dilara Doğan
Bahadir Altun
Muhammed Said Zengin
Mucahid Kutlu
Tamer Elsayed
58
3
0
23 Jul 2022
PanGu-Coder: Program Synthesis with Function-Level Language Modeling
Fenia Christopoulou
Gerasimos Lampouras
Milan Gritta
Guchun Zhang
Yinpeng Guo
...
Guangtai Liang
Jia Wei
Xin Jiang
Qianxiang Wang
Qun Liu
ELM SyDa ALM
109
76
0
22 Jul 2022
Applying Spatiotemporal Attention to Identify Distracted and Drowsy Driving with Vision Transformers
Samay Lakhani
ViT MedIm
43
2
0
22 Jul 2022
Multi-Level Fine-Tuning, Data Augmentation, and Few-Shot Learning for Specialized Cyber Threat Intelligence
Markus Bayer
Tobias Frey
Christian A. Reuter
AAML
61
17
0
22 Jul 2022
BigIssue: A Realistic Bug Localization Benchmark
Paul Kassianik
Erik Nijkamp
Bo Pang
Yingbo Zhou
Caiming Xiong
42
0
0
21 Jul 2022
Towards Efficient Adversarial Training on Vision Transformers
Boxi Wu
Jindong Gu
Zhifeng Li
Deng Cai
Xiaofei He
Wei Liu
ViT AAML
94
40
0
21 Jul 2022
CodeT: Code Generation with Generated Tests
Bei Chen
Fengji Zhang
A. Nguyen
Daoguang Zan
Zeqi Lin
Jian-Guang Lou
Weizhu Chen
105
349
0
21 Jul 2022