ResearchTrend.AI
Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 12,277 papers shown
Title
Natural Language Sentence Generation from API Specifications
Siyu Huo
K. Mukherjee
Jayachandu Bandlamudi
Vatche Isahagian
Vinod Muthusamy
Sadhana Kumaravel
59
2
0
01 Jun 2022
Dynaformer: A Deep Learning Model for Ageing-aware Battery Discharge Prediction
Luca Biggio
Tommaso Bendinelli
Chetan S. Kulkarni
Olga Fink
61
11
0
01 Jun 2022
Romantic-Computing
Elizabeth Horishny
79
0
0
01 Jun 2022
DiVAE: Photorealistic Images Synthesis with Denoising Diffusion Decoder
Jie Shi
Chenfei Wu
Jian Liang
Xiang Liu
Nan Duan
DiffM
76
26
0
01 Jun 2022
Task-Specific Expert Pruning for Sparse Mixture-of-Experts
Tianyu Chen
Shaohan Huang
Yuan Xie
Binxing Jiao
Daxin Jiang
Haoyi Zhou
Jianxin Li
Furu Wei
MoE
88
42
0
01 Jun 2022
THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption
Tianyu Chen
Hangbo Bao
Shaohan Huang
Li Dong
Binxing Jiao
Daxin Jiang
Haoyi Zhou
Jianxin Li
Furu Wei
98
107
0
01 Jun 2022
Byzantine-Robust Online and Offline Distributed Reinforcement Learning
Yiding Chen
Xuezhou Zhang
Kai Zhang
Mengdi Wang
Xiaojin Zhu
OffRL
133
18
0
01 Jun 2022
Efficient Self-supervised Vision Pretraining with Local Masked Reconstruction
Jun Chen
Ming Hu
Boyang Albert Li
Mohamed Elhoseiny
142
37
0
01 Jun 2022
Pre-training via Denoising for Molecular Property Prediction
Sheheryar Zaidi
Michael Schaarschmidt
James Martens
Hyunjik Kim
Yee Whye Teh
Alvaro Sanchez-Gonzalez
Peter W. Battaglia
Razvan Pascanu
Jonathan Godwin
DiffM, AI4CE
129
127
0
31 May 2022
Asynchronous Hierarchical Federated Learning
Xing Wang
Yijun Wang
FedML
73
6
0
31 May 2022
Neural Retriever and Go Beyond: A Thesis Proposal
Man Luo
100
1
0
31 May 2022
On the Usefulness of Embeddings, Clusters and Strings for Text Generator Evaluation
Tiago Pimentel
Clara Meister
Ryan Cotterell
117
7
0
31 May 2022
You Can't Count on Luck: Why Decision Transformers and RvS Fail in Stochastic Environments
Keiran Paster
Sheila A. McIlraith
Jimmy Ba
OffRL
262
28
0
31 May 2022
Knowledge Graph - Deep Learning: A Case Study in Question Answering in Aviation Safety Domain
Ankush Agarwal
Raju Gite
Shreyash Laddha
P. Bhattacharyya
Satyanarayan Kar
Asif Ekbal
Prabhjit Thind
Rajesh Zele
Ravi Shankar
122
14
0
31 May 2022
VQ-AR: Vector Quantized Autoregressive Probabilistic Time Series Forecasting
Kashif Rasul
Young-Jin Park
Max Nihlén Ramström
KyungHyun Kim
BDL, AI4TS
38
4
0
31 May 2022
Leveraging Pre-Trained Language Models to Streamline Natural Language Interaction for Self-Tracking
Young-Ho Kim
Sungdong Kim
Minsuk Chang
Sang-Woo Lee
87
5
0
31 May 2022
FinBERT-MRC: financial named entity recognition using BERT under the machine reading comprehension paradigm
Yuzhe Zhang
Hong Zhang
85
28
0
31 May 2022
Few-Shot Diffusion Models
Giorgio Giannone
Didrik Nielsen
Ole Winther
DiffM
231
51
0
30 May 2022
Attention Flows for General Transformers
Niklas Metzger
Christopher Hahn
Julian Siber
Frederik Schmitt
Bernd Finkbeiner
67
0
0
30 May 2022
Zero-Shot and Few-Shot Learning for Lung Cancer Multi-Label Classification using Vision Transformer
F. Guo
Yingfang Fan
ViT, MedIm
124
7
0
30 May 2022
VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models
Wangchunshu Zhou
Yan Zeng
Shizhe Diao
Xinsong Zhang
CoGe, VLM
97
13
0
30 May 2022
A Survey in Mathematical Language Processing
Jordan Meadows
André Freitas
AIMat
63
16
0
30 May 2022
Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models
Mengzhou Xia
Mikel Artetxe
Jingfei Du
Danqi Chen
Ves Stoyanov
49
6
0
30 May 2022
Automatic Short Math Answer Grading via In-context Meta-learning
Mengxue Zhang
Sami Baral
Neil T. Heffernan
Andrew Lan
AI4Ed, AIMat
69
25
0
30 May 2022
Flowification: Everything is a Normalizing Flow
Bálint Máté
Samuel Klein
T. Golling
François Fleuret
64
3
0
30 May 2022
Superposing Many Tickets into One: A Performance Booster for Sparse Neural Network Training
Lu Yin
Vlado Menkovski
Meng Fang
Tianjin Huang
Yulong Pei
Mykola Pechenizkiy
Decebal Constantin Mocanu
Shiwei Liu
108
8
0
30 May 2022
Billions of Parameters Are Worth More Than In-domain Training Data: A case study in the Legal Case Entailment Task
G. Rosa
L. Bonifacio
Vitor Jeronymo
Hugo Queiroz Abonizio
R. Lotufo
Rodrigo Nogueira
AILaw, ELM
100
11
0
30 May 2022
A Transistor Operations Model for Deep Learning Energy Consumption Scaling Law
Chen Li
Antonios Tsourdos
Weisi Guo
AI4CE
59
3
0
30 May 2022
Dataset Condensation via Efficient Synthetic-Data Parameterization
Jang-Hyun Kim
Jinuk Kim
Seong Joon Oh
Sangdoo Yun
Hwanjun Song
Joonhyun Jeong
Jung-Woo Ha
Hyun Oh Song
DD
518
168
0
30 May 2022
Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
Muning Wen
J. Kuba
Runji Lin
Weinan Zhang
Ying Wen
Jun Wang
Yaodong Yang
103
192
0
30 May 2022
E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation
Qihuang Zhong
Liang Ding
Juhua Liu
Bo Du
Dacheng Tao
105
27
0
30 May 2022
Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning
Aniket Didolkar
Kshitij Gupta
Anirudh Goyal
Nitesh B. Gundavarapu
Alex Lamb
Nan Rosemary Ke
Yoshua Bengio
AI4CE
200
18
0
30 May 2022
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
322
632
0
29 May 2022
Decoupling Knowledge from Memorization: Retrieval-augmented Prompt Learning
Xiang Chen
Lei Li
Ningyu Zhang
Xiaozhuan Liang
Shumin Deng
Chuanqi Tan
Fei Huang
Luo Si
Huajun Chen
VLM
153
55
0
29 May 2022
The impact of memory on learning sequence-to-sequence tasks
Alireza Seif
S. Loos
Gennaro Tucci
É. Roldán
Sebastian Goldt
70
5
0
29 May 2022
COFS: Controllable Furniture layout Synthesis
W. Para
Paul Guerrero
Niloy Mitra
Peter Wonka
3DV
88
17
0
29 May 2022
Improving VAE-based Representation Learning
Mingtian Zhang
Tim Z. Xiao
Brooks Paige
David Barber
SSL, DRL
70
10
0
28 May 2022
All That's Happening behind the Scenes: Putting the Spotlight on Volunteer Moderator Labor in Reddit
Hanlin Li
Brent J. Hecht
Stevie Chancellor
57
42
0
28 May 2022
Parameter-Efficient and Student-Friendly Knowledge Distillation
Jun Rao
Xv Meng
Liang Ding
Shuhan Qi
Dacheng Tao
93
51
0
28 May 2022
Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training
Renrui Zhang
Ziyu Guo
Rongyao Fang
Bingyan Zhao
Dong Wang
Yu Qiao
Hongsheng Li
Peng Gao
3DPC
258
261
0
28 May 2022
Gating Dropout: Communication-efficient Regularization for Sparsely Activated Transformers
R. Liu
Young Jin Kim
Alexandre Muzio
Hany Awadalla
MoE
79
22
0
28 May 2022
Teaching Models to Express Their Uncertainty in Words
Stephanie C. Lin
Jacob Hilton
Owain Evans
OOD
107
425
0
28 May 2022
Few-shot Subgoal Planning with Language Models
Lajanugen Logeswaran
Yao Fu
Moontae Lee
Honglak Lee
LRM
76
26
0
28 May 2022
Controllable Text Generation with Neurally-Decomposed Oracle
Tao Meng
Sidi Lu
Nanyun Peng
Kai-Wei Chang
BDL
103
37
0
27 May 2022
Diffusion-LM Improves Controllable Text Generation
Xiang Lisa Li
John Thickstun
Ishaan Gulrajani
Percy Liang
Tatsunori B. Hashimoto
AI4CE
248
837
0
27 May 2022
Multimodal Masked Autoencoders Learn Transferable Representations
Xinyang Geng
Hao Liu
Lisa Lee
Dale Schuurmans
Sergey Levine
Pieter Abbeel
88
119
0
27 May 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
276
2,297
0
27 May 2022
GIT: A Generative Image-to-text Transformer for Vision and Language
Jianfeng Wang
Zhengyuan Yang
Xiaowei Hu
Linjie Li
Kevin Qinghong Lin
Zhe Gan
Zicheng Liu
Ce Liu
Lijuan Wang
VLM
172
562
0
27 May 2022
AANG: Automating Auxiliary Learning
Lucio Dery
Paul Michel
M. Khodak
Graham Neubig
Ameet Talwalkar
107
9
0
27 May 2022
Simple Unsupervised Object-Centric Learning for Complex and Naturalistic Videos
Gautam Singh
Yi-Fu Wu
Sungjin Ahn
OCL
147
121
0
27 May 2022