ResearchTrend.AI
Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 11,078 papers shown
Title
Bootleg: Chasing the Tail with Self-Supervised Named Entity Disambiguation
Laurel J. Orr
Megan Leszczynski
Simran Arora
Sen Wu
Neel Guha
Xiao Ling
Christopher Ré
143
48
0
20 Oct 2020
Local Knowledge Powered Conversational Agents
Sashank Santhanam
Ming-Yu Liu
Raul Puri
M. Shoeybi
M. Patwary
Bryan Catanzaro
29
4
0
20 Oct 2020
Neural Language Modeling for Contextualized Temporal Graph Generation
Aman Madaan
Yiming Yang
38
20
0
20 Oct 2020
Optimism in the Face of Adversity: Understanding and Improving Deep Learning through Adversarial Robustness
Guillermo Ortiz-Jiménez
Apostolos Modas
Seyed-Mohsen Moosavi-Dezfooli
P. Frossard
AAML
29
48
0
19 Oct 2020
Consistency and Coherency Enhanced Story Generation
Wei Wang
Piji Li
Haitao Zheng
30
11
0
17 Oct 2020
For self-supervised learning, Rationality implies generalization, provably
Yamini Bansal
Gal Kaplun
Boaz Barak
OOD
SSL
58
22
0
16 Oct 2020
Masked Contrastive Representation Learning for Reinforcement Learning
Jinhua Zhu
Yingce Xia
Lijun Wu
Jiajun Deng
Wen-gang Zhou
Tao Qin
Houqiang Li
SSL
OffRL
34
55
0
15 Oct 2020
Neural Databases
James Thorne
Majid Yazdani
Marzieh Saeidi
Fabrizio Silvestri
Sebastian Riedel
A. Halevy
NAI
34
9
0
14 Oct 2020
Pretrained Transformers for Text Ranking: BERT and Beyond
Jimmy J. Lin
Rodrigo Nogueira
Andrew Yates
VLM
242
611
0
13 Oct 2020
MixCo: Mix-up Contrastive Learning for Visual Representation
Sungnyun Kim
Gihun Lee
Sangmin Bae
Seyoung Yun
SSL
112
80
0
13 Oct 2020
Improving Text Generation with Student-Forcing Optimal Transport
Guoyin Wang
Chunyuan Li
Jianqiao Li
Hao Fu
Yuh-Chen Lin
...
Ruiyi Zhang
Wenlin Wang
Dinghan Shen
Qian Yang
Lawrence Carin
OT
30
17
0
12 Oct 2020
Neural, Symbolic and Neural-Symbolic Reasoning on Knowledge Graphs
Jing Zhang
Bo Chen
Lingxi Zhang
Xirui Ke
Haipeng Ding
NAI
35
3
0
12 Oct 2020
SMYRF: Efficient Attention using Asymmetric Clustering
Giannis Daras
Nikita Kitaev
Augustus Odena
A. Dimakis
31
44
0
11 Oct 2020
A ground-truth dataset and classification model for detecting bots in GitHub issue and PR comments
M. Golzadeh
Alexandre Decan
Damien Legay
T. Mens
29
73
0
07 Oct 2020
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components
Junwen Bai
Weiran Wang
Yingbo Zhou
Caiming Xiong
SSL
AI4TS
27
12
0
07 Oct 2020
InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective
Boxin Wang
Shuohang Wang
Yu Cheng
Zhe Gan
R. Jia
Bo-wen Li
Jingjing Liu
AAML
46
113
0
05 Oct 2020
Local Label Point Correction for Edge Detection of Overlapping Cervical Cells
Jiawei Liu
Huijie Fan
Qiang Wang
Wentao Li
Yandong Tang
Danbo Wang
Mingyi Zhou
Li Chen
13
9
0
05 Oct 2020
PMI-Masking: Principled masking of correlated spans
Yoav Levine
Barak Lenz
Opher Lieber
Omri Abend
Kevin Leyton-Brown
Moshe Tennenholtz
Y. Shoham
22
72
0
05 Oct 2020
Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models
Thuy-Trang Vu
Dinh Q. Phung
Gholamreza Haffari
16
24
0
05 Oct 2020
Data-Efficient Pretraining via Contrastive Self-Supervision
Nils Rethmeier
Isabelle Augenstein
23
20
0
02 Oct 2020
Where Does Trust Break Down? A Quantitative Trust Analysis of Deep Neural Networks via Trust Matrix and Conditional Trust Densities
Andrew Hryniowski
Xiao Yu Wang
A. Wong
17
10
0
30 Sep 2020
Understanding Human Intelligence through Human Limitations
Thomas L. Griffiths
28
64
0
29 Sep 2020
Utility is in the Eye of the User: A Critique of NLP Leaderboards
Kawin Ethayarajh
Dan Jurafsky
ELM
24
51
0
29 Sep 2020
From Twitter to Traffic Predictor: Next-Day Morning Traffic Prediction Using Social Media Data
Weiran Yao
Sean Qian
16
47
0
29 Sep 2020
Pchatbot: A Large-Scale Dataset for Personalized Chatbot
Hongjin Qian
Xiaohe Li
Hanxun Zhong
Yu Guo
Yueyuan Ma
Yutao Zhu
Zhanliang Liu
Zhicheng Dou
Ji-Rong Wen
41
43
0
28 Sep 2020
KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning
Ye Liu
Yao Wan
Lifang He
Hao Peng
Philip S. Yu
32
188
0
26 Sep 2020
Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases
Gerhard Weikum
Luna Dong
Simon Razniewski
Fabian M. Suchanek
30
125
0
24 Sep 2020
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models
Samuel Gehman
Suchin Gururangan
Maarten Sap
Yejin Choi
Noah A. Smith
37
1,130
0
24 Sep 2020
Controlling Style in Generated Dialogue
Eric Michael Smith
Diana Gonzalez-Rico
Emily Dinan
Y-Lan Boureau
AI4CE
36
50
0
22 Sep 2020
VirtualFlow: Decoupling Deep Learning Models from the Underlying Hardware
Andrew Or
Haoyu Zhang
M. Freedman
17
9
0
20 Sep 2020
Conditionally Adaptive Multi-Task Learning: Improving Transfer Learning in NLP Using Fewer Parameters & Less Data
Jonathan Pilault
Amine Elhattami
C. Pal
CLL
MoE
30
89
0
19 Sep 2020
Generation-Augmented Retrieval for Open-domain Question Answering
Yuning Mao
Pengcheng He
Xiaodong Liu
Yelong Shen
Jianfeng Gao
Jiawei Han
Weizhu Chen
RALM
42
238
0
17 Sep 2020
Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks
Trapit Bansal
Rishikesh Jha
Tsendsuren Munkhdalai
Andrew McCallum
SSL
VLM
22
87
0
17 Sep 2020
Review: Deep Learning in Electron Microscopy
Jeffrey M. Ede
34
79
0
17 Sep 2020
Efficient Transformer-based Large Scale Language Representations using Hardware-friendly Block Structured Pruning
Bingbing Li
Zhenglun Kong
Tianyun Zhang
Ji Li
Zechao Li
Hang Liu
Caiwen Ding
VLM
32
64
0
17 Sep 2020
Generative Language-Grounded Policy in Vision-and-Language Navigation with Bayes' Rule
Shuhei Kurita
Kyunghyun Cho
LM&Ro
17
23
0
16 Sep 2020
Automated Source Code Generation and Auto-completion Using Deep Learning: Comparing and Discussing Current Language-Model-Related Approaches
Juan Cruz-Benito
Sanjay Vishwakarma
Francisco Martín-Fernández
Ismael Faro
22
30
0
16 Sep 2020
Evaluating representations by the complexity of learning low-loss predictors
William F. Whitney
M. Song
David Brandfonbrener
Jaan Altosaar
Kyunghyun Cho
25
23
0
15 Sep 2020
Augmented Natural Language for Generative Sequence Labeling
Ben Athiwaratkun
Cicero Nogueira dos Santos
Jason Krone
Bing Xiang
VLM
14
61
0
15 Sep 2020
Critical Thinking for Language Models
Gregor Betz
Christian Voigt
Kyle Richardson
SyDa
ReLM
LRM
AI4CE
23
35
0
15 Sep 2020
It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
Timo Schick
Hinrich Schütze
51
955
0
15 Sep 2020
MLMLM: Link Prediction with Mean Likelihood Masked Language Model
Louis Clouâtre
P. Trempe
Amal Zouaq
Sarath Chandar
25
43
0
15 Sep 2020
The Radicalization Risks of GPT-3 and Advanced Neural Language Models
Kris McGuffie
Alex Newhouse
14
149
0
15 Sep 2020
Efficient Transformers: A Survey
Yi Tay
Mostafa Dehghani
Dara Bahri
Donald Metzler
VLM
114
1,102
0
14 Sep 2020
GeDi: Generative Discriminator Guided Sequence Generation
Ben Krause
Akhilesh Deepak Gotmare
Bryan McCann
N. Keskar
Shafiq Joty
R. Socher
Nazneen Rajani
56
389
0
14 Sep 2020
The Hardware Lottery
Sara Hooker
27
203
0
14 Sep 2020
Abstract Neural Networks
Matthew Sotoudeh
Aditya V. Thakur
8
19
0
11 Sep 2020
The Intriguing Relation Between Counterfactual Explanations and Adversarial Examples
Timo Freiesleben
GAN
41
62
0
11 Sep 2020
SoK: Certified Robustness for Deep Neural Networks
Linyi Li
Tao Xie
Bo-wen Li
AAML
33
128
0
09 Sep 2020
Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding
Sahar Abdelnabi
Mario Fritz
WaLM
26
89
0
07 Sep 2020