ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2005.14165
  4. Cited By
Language Models are Few-Shot Learners

Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Ma-teusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL
ArXivPDFHTML

Papers citing "Language Models are Few-Shot Learners"

50 / 11,099 papers shown
Title
Improved Denoising Diffusion Probabilistic Models
Improved Denoising Diffusion Probabilistic Models
Alex Nichol
Prafulla Dhariwal
DiffM
60
3,526
0
18 Feb 2021
Meta-Transfer Learning for Low-Resource Abstractive Summarization
Meta-Transfer Learning for Low-Resource Abstractive Summarization
Yi-Syuan Chen
Hong-Han Shuai
CLL
OffRL
48
38
0
18 Feb 2021
Training Large-Scale News Recommenders with Pretrained Language Models
  in the Loop
Training Large-Scale News Recommenders with Pretrained Language Models in the Loop
Shitao Xiao
Zheng Liu
Yingxia Shao
Tao Di
Xing Xie
VLM
AIFin
127
41
0
18 Feb 2021
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize
  Long-Tail Visual Concepts
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts
Soravit Changpinyo
P. Sharma
Nan Ding
Radu Soricut
VLM
299
1,084
0
17 Feb 2021
Preventing Unauthorized Use of Proprietary Data: Poisoning for Secure
  Dataset Release
Preventing Unauthorized Use of Proprietary Data: Poisoning for Secure Dataset Release
Liam H. Fowl
Ping Yeh-Chiang
Micah Goldblum
Jonas Geiping
Arpit Bansal
W. Czaja
Tom Goldstein
24
43
0
16 Feb 2021
Accelerated Sparse Neural Training: A Provable and Efficient Method to
  Find N:M Transposable Masks
Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks
Itay Hubara
Brian Chmiel
Moshe Island
Ron Banner
S. Naor
Daniel Soudry
59
111
0
16 Feb 2021
GradInit: Learning to Initialize Neural Networks for Stable and
  Efficient Training
GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training
Chen Zhu
Renkun Ni
Zheng Xu
Kezhi Kong
Yifan Jiang
Tom Goldstein
ODL
41
53
0
16 Feb 2021
Exploring Transformers in Natural Language Generation: GPT, BERT, and
  XLNet
Exploring Transformers in Natural Language Generation: GPT, BERT, and XLNet
M. O. Topal
Anil Bas
Imke van Heerden
LLMAG
AI4CE
26
88
0
16 Feb 2021
Training Larger Networks for Deep Reinforcement Learning
Training Larger Networks for Deep Reinforcement Learning
Keita Ota
Devesh K. Jha
Asako Kanezaki
OffRL
23
39
0
16 Feb 2021
Prompt Programming for Large Language Models: Beyond the Few-Shot
  Paradigm
Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm
Laria Reynolds
Kyle McDonell
28
848
0
15 Feb 2021
Understanding Negative Samples in Instance Discriminative
  Self-supervised Representation Learning
Understanding Negative Samples in Instance Discriminative Self-supervised Representation Learning
Kento Nozawa
Issei Sato
SSL
22
43
0
13 Feb 2021
Explaining Neural Scaling Laws
Explaining Neural Scaling Laws
Yasaman Bahri
Ethan Dyer
Jared Kaplan
Jaehoon Lee
Utkarsh Sharma
27
250
0
12 Feb 2021
Proof Artifact Co-training for Theorem Proving with Language Models
Proof Artifact Co-training for Theorem Proving with Language Models
Jesse Michael Han
Jason M. Rute
Yuhuai Wu
Edward W. Ayers
Stanislas Polu
AIMat
25
120
0
11 Feb 2021
Cross-Domain Multi-Task Learning for Sequential Sentence Classification
  in Research Papers
Cross-Domain Multi-Task Learning for Sequential Sentence Classification in Research Papers
Arthur Brack
Anett Hoppe
Pascal Buschermöhle
Ralph Ewerth
27
18
0
11 Feb 2021
A Framework for Auditing Data Center Energy Usage and Mitigating
  Environmental Footprint
A Framework for Auditing Data Center Energy Usage and Mitigating Environmental Footprint
Justin Gould
12
1
0
08 Feb 2021
Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch
Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch
Aojun Zhou
Yukun Ma
Junnan Zhu
Jianbo Liu
Zhijie Zhang
Kun Yuan
Wenxiu Sun
Hongsheng Li
55
240
0
08 Feb 2021
Unifying Vision-and-Language Tasks via Text Generation
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Joey Tianyi Zhou
MLLM
277
525
0
04 Feb 2021
Embodied Intelligence via Learning and Evolution
Embodied Intelligence via Learning and Evolution
Agrim Gupta
Silvio Savarese
Surya Ganguli
Li Fei-Fei
AI4CE
22
230
0
03 Feb 2021
When Can Models Learn From Explanations? A Formal Framework for
  Understanding the Roles of Explanation Data
When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data
Peter Hase
Joey Tianyi Zhou
XAI
25
87
0
03 Feb 2021
Mind the Gap: Assessing Temporal Generalization in Neural Language
  Models
Mind the Gap: Assessing Temporal Generalization in Neural Language Models
Angeliki Lazaridou
A. Kuncoro
E. Gribovskaya
Devang Agrawal
Adam Liska
...
Sebastian Ruder
Dani Yogatama
Kris Cao
Susannah Young
Phil Blunsom
VLM
41
207
0
03 Feb 2021
MAUVE: Measuring the Gap Between Neural Text and Human Text using
  Divergence Frontiers
MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers
Krishna Pillutla
Swabha Swayamdipta
Rowan Zellers
John Thickstun
Sean Welleck
Yejin Choi
Zaïd Harchaoui
42
343
0
02 Feb 2021
Measuring and Improving Consistency in Pretrained Language Models
Measuring and Improving Consistency in Pretrained Language Models
Yanai Elazar
Nora Kassner
Shauli Ravfogel
Abhilasha Ravichander
Eduard H. Hovy
Hinrich Schütze
Yoav Goldberg
HILM
269
346
0
01 Feb 2021
Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained
  Language Models
Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained Language Models
Nora Kassner
Philipp Dufter
Hinrich Schütze
21
133
0
01 Feb 2021
Decoupling the Role of Data, Attention, and Losses in Multimodal
  Transformers
Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers
Lisa Anne Hendricks
John F. J. Mellor
R. Schneider
Jean-Baptiste Alayrac
Aida Nematzadeh
79
110
0
31 Jan 2021
Can We Automate Scientific Reviewing?
Can We Automate Scientific Reviewing?
Weizhe Yuan
Pengfei Liu
Graham Neubig
83
84
0
30 Jan 2021
BENDR: using transformers and a contrastive self-supervised learning
  task to learn from massive amounts of EEG data
BENDR: using transformers and a contrastive self-supervised learning task to learn from massive amounts of EEG data
Demetres Kostas
Stephane Aroca-Ouellette
Frank Rudzicz
SSL
46
202
0
28 Jan 2021
Bottleneck Transformers for Visual Recognition
Bottleneck Transformers for Visual Recognition
A. Srinivas
Nayeon Lee
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
290
980
0
27 Jan 2021
TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models
TT-Rec: Tensor Train Compression for Deep Learning Recommendation Models
Chunxing Yin
Bilge Acun
Xing Liu
Carole-Jean Wu
50
102
0
25 Jan 2021
Pruning and Quantization for Deep Neural Network Acceleration: A Survey
Pruning and Quantization for Deep Neural Network Acceleration: A Survey
Tailin Liang
C. Glossner
Lei Wang
Shaobo Shi
Xiaotong Zhang
MQ
150
675
0
24 Jan 2021
Word Alignment by Fine-tuning Embeddings on Parallel Corpora
Word Alignment by Fine-tuning Embeddings on Parallel Corpora
Zi-Yi Dou
Graham Neubig
96
257
0
20 Jan 2021
Towards Facilitating Empathic Conversations in Online Mental Health
  Support: A Reinforcement Learning Approach
Towards Facilitating Empathic Conversations in Online Mental Health Support: A Reinforcement Learning Approach
Ashish Sharma
Inna Wanyin Lin
Adam S. Miner
David C. Atkins
Tim Althoff
AI4MH
25
139
0
19 Jan 2021
Diagnostic Captioning: A Survey
Diagnostic Captioning: A Survey
John Pavlopoulos
Vasiliki Kougia
Ion Androutsopoulos
D. Papamichail
3DV
MedIm
91
26
0
18 Jan 2021
ZeRO-Offload: Democratizing Billion-Scale Model Training
ZeRO-Offload: Democratizing Billion-Scale Model Training
Jie Ren
Samyam Rajbhandari
Reza Yazdani Aminabadi
Olatunji Ruwase
Shuangyang Yang
Minjia Zhang
Dong Li
Yuxiong He
MoE
177
416
0
18 Jan 2021
Counterfactual Generative Networks
Counterfactual Generative Networks
Axel Sauer
Andreas Geiger
OOD
BDL
CML
41
123
0
15 Jan 2021
GAN Inversion: A Survey
GAN Inversion: A Survey
Weihao Xia
Yulun Zhang
Yujiu Yang
Jing-Hao Xue
Bolei Zhou
Ming-Hsuan Yang
DiffM
65
507
0
14 Jan 2021
A Convergence Theory Towards Practical Over-parameterized Deep Neural
  Networks
A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks
Asaf Noy
Yi Tian Xu
Y. Aflalo
Lihi Zelnik-Manor
Rong Jin
39
3
0
12 Jan 2021
Switch Transformers: Scaling to Trillion Parameter Models with Simple
  and Efficient Sparsity
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
W. Fedus
Barret Zoph
Noam M. Shazeer
MoE
11
2,080
0
11 Jan 2021
Investigating the Vision Transformer Model for Image Retrieval Tasks
Investigating the Vision Transformer Model for Image Retrieval Tasks
S. Gkelios
Y. Boutalis
S. Chatzichristofis
VLM
ViT
26
30
0
11 Jan 2021
Learning quantum data with the quantum Earth Mover's distance
Learning quantum data with the quantum Earth Mover's distance
B. Kiani
Giacomo De Palma
M. Marvian
Zi-Wen Liu
S. Lloyd
21
45
0
08 Jan 2021
I-BERT: Integer-only BERT Quantization
I-BERT: Integer-only BERT Quantization
Sehoon Kim
A. Gholami
Z. Yao
Michael W. Mahoney
Kurt Keutzer
MQ
107
341
0
05 Jan 2021
Transformers in Vision: A Survey
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
227
2,431
0
04 Jan 2021
Retrieving and Reading: A Comprehensive Survey on Open-domain Question
  Answering
Retrieving and Reading: A Comprehensive Survey on Open-domain Question Answering
Fengbin Zhu
Wenqiang Lei
Chao Wang
Jianming Zheng
Soujanya Poria
Tat-Seng Chua
RALM
213
252
0
04 Jan 2021
Few-Shot Question Answering by Pretraining Span Selection
Few-Shot Question Answering by Pretraining Span Selection
Ori Ram
Yuval Kirstain
Jonathan Berant
Amir Globerson
Omer Levy
36
97
0
02 Jan 2021
Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting
Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting
Wangchunshu Zhou
Tao Ge
Canwen Xu
Ke Xu
Furu Wei
LRM
16
15
0
02 Jan 2021
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Xiang Lisa Li
Percy Liang
20
4,088
0
01 Jan 2021
WARP: Word-level Adversarial ReProgramming
WARP: Word-level Adversarial ReProgramming
Karen Hambardzumyan
Hrant Khachatrian
Jonathan May
AAML
254
342
0
01 Jan 2021
Studying Strategically: Learning to Mask for Closed-book QA
Studying Strategically: Learning to Mask for Closed-book QA
Qinyuan Ye
Belinda Z. Li
Sinong Wang
Benjamin Bolte
Hao Ma
Wen-tau Yih
Xiang Ren
Madian Khabsa
OffRL
24
11
0
31 Dec 2020
Shortformer: Better Language Modeling using Shorter Inputs
Shortformer: Better Language Modeling using Shorter Inputs
Ofir Press
Noah A. Smith
M. Lewis
230
89
0
31 Dec 2020
Making Pre-trained Language Models Better Few-shot Learners
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao
Adam Fisch
Danqi Chen
243
1,924
0
31 Dec 2020
A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots
  Matters
A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters
Mengjie Zhao
Yi Zhu
Ehsan Shareghi
Ivan Vulić
Roi Reichart
Anna Korhonen
Hinrich Schütze
32
64
0
31 Dec 2020
Previous
123...217218219220221222
Next