
Language Models are Few-Shot Learners (arXiv:2005.14165)
28 May 2020
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, T. Henighan, R. Child, Aditya A. Ramesh, Daniel M. Ziegler, Jeff Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, B. Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei

Papers citing "Language Models are Few-Shot Learners" (50 of 11,099 shown)
8-bit Optimizers via Block-wise Quantization
Tim Dettmers, M. Lewis, Sam Shleifer, Luke Zettlemoyer (06 Oct 2021)

Pretraining & Reinforcement Learning: Sharpening the Axe Before Cutting the Tree
Saurav Kadavath, Samuel Paradis, Brian Yao (06 Oct 2021)

Exploring the Limits of Large Scale Pre-training
Samira Abnar, Mostafa Dehghani, Behnam Neyshabur, Hanie Sedghi (05 Oct 2021)

Deep Neural Networks and Tabular Data: A Survey
V. Borisov, Tobias Leemann, Kathrin Seßler, Johannes Haug, Martin Pawelczyk, Gjergji Kasneci (05 Oct 2021)

Data Augmentation Approaches in Natural Language Processing: A Survey
Bohan Li, Yutai Hou, Wanxiang Che (05 Oct 2021)

A Survey On Neural Word Embeddings
Erhan Sezerer, Selma Tekir (05 Oct 2021)

MoEfication: Transformer Feed-forward Layers are Mixtures of Experts
Zhengyan Zhang, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou (05 Oct 2021)

AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts
Tongshuang Wu, Michael Terry, Carrie J. Cai (04 Oct 2021)

Perhaps PTLMs Should Go to School -- A Task to Assess Open Book and Closed Book QA
Manuel R. Ciosici, Joe Cecil, Alex Hedges, Dong-Ho Lee, Marjorie Freedman, R. Weischedel (04 Oct 2021)

Trustworthy AI: From Principles to Practices
Bo-wen Li, Peng Qi, Bo Liu, Shuai Di, Jingen Liu, Jiquan Pei, Jinfeng Yi, Bowen Zhou (04 Oct 2021)

LexGLUE: A Benchmark Dataset for Legal Language Understanding in English
Ilias Chalkidis, Abhik Jana, D. Hartung, M. Bommarito, Ion Androutsopoulos, Daniel Martin Katz, Nikolaos Aletras (03 Oct 2021)

Collecting and Characterizing Natural Language Utterances for Specifying Data Visualizations
Arjun Srinivasan, Nikhila Nyapathy, Bongshin Lee, Steven Drucker, J. Stasko (01 Oct 2021)

Powerpropagation: A sparsity inducing weight reparameterisation
Jonathan Richard Schwarz, Siddhant M. Jayakumar, Razvan Pascanu, P. Latham, Yee Whye Teh (01 Oct 2021)

UserIdentifier: Implicit User Representations for Simple and Effective Personalized Sentiment Analysis
Fatemehsadat Mireshghallah, Vaishnavi Shrivastava, Milad Shokouhi, Taylor Berg-Kirkpatrick, Robert Sim, Dimitrios Dimitriadis (01 Oct 2021)

Reinforcement Learning with Information-Theoretic Actuation
Elliot Catt, Marcus Hutter, J. Veness (30 Sep 2021)

A Review of Text Style Transfer using Deep Learning
Martina Toshevska, Sonja Gievska (30 Sep 2021)

Towards Efficient Post-training Quantization of Pre-trained Language Models
Haoli Bai, Lu Hou, Lifeng Shang, Xin Jiang, Irwin King, M. Lyu (30 Sep 2021)

Structural Persistence in Language Models: Priming as a Window into Abstract Language Representations
Arabella J. Sinclair, Jaap Jumelet, Willem H. Zuidema, Raquel Fernández (30 Sep 2021)

Introducing the DOME Activation Functions
Mohamed E. Hussein, Wael AbdAlmageed (30 Sep 2021)

Collaborative Storytelling with Human Actors and AI Narrators
Boyd Branch, Piotr Wojciech Mirowski, Kory W. Mathewson (29 Sep 2021)

MedPerf: Open Benchmarking Platform for Medical Artificial Intelligence using Federated Evaluation
Alexandros Karargyris, Renato Umeton, Micah J. Sheller, Alejandro Aristizabal, Johnu George, ..., Poonam Yadav, Michael Rosenthal, M. Loda, Jason M. Johnson, Peter Mattson (29 Sep 2021)

Stochastic Training is Not Necessary for Generalization
Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein (29 Sep 2021)

PPL-MCTS: Constrained Textual Generation Through Discriminator-Guided MCTS Decoding
Antoine Chaffin, Vincent Claveau, Ewa Kijak (28 Sep 2021)

Which Design Decisions in AI-enabled Mobile Applications Contribute to Greener AI?
Roger Creus Castanyer, Silverio Martínez-Fernández, Xavier Franch (28 Sep 2021)

Template-free Prompt Tuning for Few-shot NER
Ruotian Ma, Xin Zhou, Tao Gui, Y. Tan, Linyang Li, Qi Zhang, Xuanjing Huang (28 Sep 2021)

TURINGBENCH: A Benchmark Environment for Turing Test in the Age of Neural Text Generation
Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee (27 Sep 2021)

FQuAD2.0: French Question Answering and knowing that you know nothing
Quentin Heinrich, Gautier Viaud, Wacim Belblidia (27 Sep 2021)

Automatic Generation of Word Problems for Academic Education via Natural Language Processing (NLP)
Stanley Uros Keller (27 Sep 2021)

Ridgeless Interpolation with Shallow ReLU Networks in 1D is Nearest Neighbor Curvature Extrapolation and Provably Generalizes on Lipschitz Functions
Boris Hanin (27 Sep 2021)

Understanding and Overcoming the Challenges of Efficient Transformer Quantization
Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort (27 Sep 2021)

How does fake news spread? Understanding pathways of disinformation spread through APIs
Lynnette Hui Xian Ng, Araz Taeihagh (27 Sep 2021)

Multi-Transformer: A New Neural Network-Based Architecture for Forecasting S&P Volatility
Eduardo Ramos-Pérez, P. Alonso-González, J. J. Núñez-Velázquez (26 Sep 2021)

DziriBERT: a Pre-trained Language Model for the Algerian Dialect
Amine Abdaoui, Mohamed Berrimi, Mourad Oussalah, A. Moussaoui (25 Sep 2021)

Scalable deeper graph neural networks for high-performance materials property prediction
Sadman Sadeed Omee, Steph-Yves M. Louis, Nihang Fu, Lai Wei, Sourin Dey, Rongzhi Dong, Qinyang Li, Jianjun Hu (25 Sep 2021)

More Than Reading Comprehension: A Survey on Datasets and Metrics of Textual Question Answering
Yang Bai, D. Wang (25 Sep 2021)

Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference
Sneha Kudugunta, Yanping Huang, Ankur Bapna, M. Krikun, Dmitry Lepikhin, Minh-Thang Luong, Orhan Firat (24 Sep 2021)

Is the Number of Trainable Parameters All That Actually Matters?
A. Chatelain, Amine Djeghri, Daniel Hesslow, Julien Launay, Iacopo Poli (24 Sep 2021)

CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models
Yuan Yao, Ao Zhang, Zhengyan Zhang, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun (24 Sep 2021)

Document Automation Architectures and Technologies: A Survey
Mohammad Ahmadi Achachlouei, Omkar Patil, Tarun Joshi, V. Nair (23 Sep 2021)

How much human-like visual experience do current self-supervised learning algorithms need in order to achieve human-level object recognition?
Emin Orhan (23 Sep 2021)

Recursively Summarizing Books with Human Feedback
Jeff Wu, Long Ouyang, Daniel M. Ziegler, Nissan Stiennon, Ryan J. Lowe, Jan Leike, Paul Christiano (22 Sep 2021)

Pix2seq: A Language Modeling Framework for Object Detection
Ting-Li Chen, Saurabh Saxena, Lala Li, David J. Fleet, Geoffrey E. Hinton (22 Sep 2021)

Small-Bench NLP: Benchmark for small single GPU trained models in Natural Language Processing
K. Kanakarajan, Bhuvana Kundumani, Malaikannan Sankarasubbu (22 Sep 2021)

Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
Yi Tay, Mostafa Dehghani, J. Rao, W. Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, Donald Metzler (22 Sep 2021)

MEPG: A Minimalist Ensemble Policy Gradient Framework for Deep Reinforcement Learning
Qiang He, Yuxun Qu, Chen Gong, Xinwen Hou (22 Sep 2021)

Awakening Latent Grounding from Pretrained Language Models for Semantic Parsing
Qian Liu, Dejian Yang, Jiahui Zhang, Jiaqi Guo, Bin Zhou, Jian-Guang Lou (22 Sep 2021)

Learning through structure: towards deep neuromorphic knowledge graph embeddings
Victor Caceres Chian, Marcel Hildebrandt, Thomas Runkler, Dominik Dold (21 Sep 2021)

Knowledge Distillation with Noisy Labels for Natural Language Understanding
Shivendra Bhardwaj, Abbas Ghaddar, Ahmad Rashid, Khalil Bibi, Cheng-huan Li, A. Ghodsi, Philippe Langlais, Mehdi Rezagholizadeh (21 Sep 2021)

Survey: Transformer based Video-Language Pre-training
Ludan Ruan, Qin Jin (21 Sep 2021)

SoK: Machine Learning Governance
Varun Chandrasekaran, Hengrui Jia, Anvith Thudi, Adelin Travers, Mohammad Yaghini, Nicolas Papernot (20 Sep 2021)