Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 12,343 papers shown
Title
General-to-Specific Transfer Labeling for Domain Adaptable Keyphrase Generation
Rui Meng
Tong Wang
Xingdi Yuan
Yingbo Zhou
Daqing He
70
6
0
20 Aug 2022
Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise
Arpit Bansal
Eitan Borgnia
Hong-Min Chu
Jie S. Li
Hamid Kazemi
Furong Huang
Micah Goldblum
Jonas Geiping
Tom Goldstein
VLM, DiffM
82
286
0
19 Aug 2022
Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies
Gati Aher
Rosa I. Arriaga
Adam Tauman Kalai
174
405
0
18 Aug 2022
MulZDG: Multilingual Code-Switching Framework for Zero-shot Dialogue Generation
Yongkang Liu
Shi Feng
Daling Wang
Yifei Zhang
66
8
0
18 Aug 2022
A Scalable, Interpretable, Verifiable & Differentiable Logic Gate Convolutional Neural Network Architecture From Truth Tables
Adrien Benamira
Tristan Guérand
Thomas Peyrin
Trevor Yap
Bryan Hooi
69
2
0
18 Aug 2022
See Finer, See More: Implicit Modality Alignment for Text-based Person Retrieval
Xiujun Shu
Wei Wen
Haoqian Wu
Keyun Chen
Yi-Zhe Song
Ruizhi Qiao
Bohan Ren
Xiao Wang
91
99
0
18 Aug 2022
Understanding Scaling Laws for Recommendation Models
Newsha Ardalani
Carole-Jean Wu
Zeliang Chen
Bhargav Bhushanam
Adnan Aziz
93
31
0
17 Aug 2022
HELP ME THINK: A Simple Prompting Strategy for Non-experts to Create Customized Content with Models
Swaroop Mishra
E. Nouri
LRM
125
27
0
17 Aug 2022
MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation
Federico Cassano
John Gouwar
Daniel Nguyen
S. Nguyen
Luna Phipps-Costin
...
Carolyn Jane Anderson
Molly Q. Feldman
Arjun Guha
Michael Greenberg
Abhinav Jangda
ELM
123
93
0
17 Aug 2022
Towards Open-vocabulary Scene Graph Generation with Prompt-based Finetuning
Tao He
Lianli Gao
Jingkuan Song
Yuan-Fang Li
VLM
88
53
0
17 Aug 2022
Boosting Distributed Training Performance of the Unpadded BERT Model
Jinle Zeng
Min Li
Zhihua Wu
Jiaqi Liu
Yuang Liu
Dianhai Yu
Yanjun Ma
67
11
0
17 Aug 2022
Constrained Few-Shot Learning: Human-Like Low Sample Complexity Learning and Non-Episodic Text Classification
Jaron Mar
Jiamou Liu
84
2
0
17 Aug 2022
EGCR: Explanation Generation for Conversational Recommendation
Bingbing Wen
Xiaoning Bu
Chirag Shah
54
2
0
17 Aug 2022
Investigating the Impact of Model Width and Density on Generalization in Presence of Label Noise
Yihao Xue
Kyle Whitecross
Baharan Mirzasoleiman
NoLa
77
1
0
17 Aug 2022
What Artificial Neural Networks Can Tell Us About Human Language Acquisition
Alex Warstadt
Samuel R. Bowman
80
120
0
17 Aug 2022
Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models
Hendrik Strobelt
Albert Webson
Victor Sanh
Benjamin Hoover
Johanna Beyer
Hanspeter Pfister
Alexander M. Rush
VLM
78
141
0
16 Aug 2022
Conv-Adapter: Exploring Parameter Efficient Transfer Learning for ConvNets
Hao Chen
R. Tao
Han Zhang
Yidong Wang
Xiang Li
Weirong Ye
Jindong Wang
Guosheng Hu
Marios Savvides
VP, VLM
129
57
0
15 Aug 2022
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
Tim Dettmers
M. Lewis
Younes Belkada
Luke Zettlemoyer
MQ
147
666
0
15 Aug 2022
Deception for Cyber Defence: Challenges and Opportunities
David Liebowitz
Surya Nepal
Kristen Moore
Cody James Christopher
S. Kanhere
David D. Nguyen
Roelien C. Timmer
Michael Longland
Keerth Rathakumar
70
10
0
15 Aug 2022
Grasping Core Rules of Time Series through Pure Models
Gedi Liu
Yifeng Jiang
Yicun Ouyang
Keyang Zhong
Yang Wang
AI4TS
88
0
0
15 Aug 2022
Explainable Artificial Intelligence for Assault Sentence Prediction in New Zealand
Harry Rodger
Andrew Lensen
Marcin Betkier
16
6
0
15 Aug 2022
Targeted Honeyword Generation with Language Models
Fang Yu
Miguel Vargas Martin
105
5
0
15 Aug 2022
Limits of an AI program for solving college math problems
E. Davis
AIMat
44
3
0
14 Aug 2022
Deep is a Luxury We Don't Have
Ahmed Taha
Yen Nhi Truong Vu
Brent Mombourquette
Thomas P. Matthews
Jason Su
Sadanand Singh
ViT, MedIm
53
2
0
11 Aug 2022
Interactive Code Generation via Test-Driven User-Intent Formalization
Shuvendu K. Lahiri
Sarah Fakhoury
Aaditya Naik
Georgios Sakkas
Saikat Chakraborty
...
Piali Choudhury
Curtis von Veh
J. Inala
Chenglong Wang
Jianfeng Gao
95
64
0
11 Aug 2022
A Comprehensive Survey of Natural Language Generation Advances from the Perspective of Digital Deception
Keenan I. Jones
Enes Altuncu
V. N. Franqueira
Yi-Chia Wang
Shujun Li
DeLMO
80
3
0
11 Aug 2022
On the Pros and Cons of Momentum Encoder in Self-Supervised Visual Representation Learning
T. Pham
Chaoning Zhang
Axi Niu
Kang Zhang
Chang D. Yoo
78
11
0
11 Aug 2022
Safety and Performance, Why not Both? Bi-Objective Optimized Model Compression toward AI Software Deployment
Jie Zhu
Leye Wang
Xiao Han
87
10
0
11 Aug 2022
Finding Reusable Machine Learning Components to Build Programming Language Processing Pipelines
Patrick Flynn
T. Vanderbruggen
C. Liao
Pei-Hung Lin
M. Emani
Xipeng Shen
80
4
0
11 Aug 2022
Reducing Retraining by Recycling Parameter-Efficient Prompts
Brian Lester
Joshua Yurtsever
Siamak Shakeri
Noah Constant
51
12
0
10 Aug 2022
Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP
Thao Nguyen
Gabriel Ilharco
Mitchell Wortsman
Sewoong Oh
Ludwig Schmidt
CLIP, VLM
180
108
0
10 Aug 2022
Can Brain Signals Reveal Inner Alignment with Human Languages?
William Jongwon Han
Jielin Qiu
Jiacheng Zhu
Mengdi Xu
Douglas Weber
Yue Liu
Ding Zhao
119
13
0
10 Aug 2022
CoditT5: Pretraining for Source Code and Natural Language Editing
Jiyang Zhang
Sheena Panthaplackel
Pengyu Nie
Junyi Jessy Li
Miloš Gligorić
KELM
93
92
0
10 Aug 2022
Generative Action Description Prompts for Skeleton-based Action Recognition
Wangmeng Xiang
Chong Li
Yuxuan Zhou
Biao Wang
Lei Zhang
96
36
0
10 Aug 2022
Limitations of Language Models in Arithmetic and Symbolic Induction
Jingu Qian
Hong Wang
Zekun Li
Shiyang Li
Xifeng Yan
ReLM, LRM
139
76
0
09 Aug 2022
Training Overparametrized Neural Networks in Sublinear Time
Yichuan Deng
Han Hu
Zhao Song
Omri Weinstein
Danyang Zhuo
BDL
92
28
0
09 Aug 2022
A Theoretical View on Sparsely Activated Networks
Cenk Baykal
Nishanth Dikkala
Rina Panigrahy
Cyrus Rashtchian
Xin Wang
28
11
0
08 Aug 2022
Txt2Img-MHN: Remote Sensing Image Generation from Text Using Modern Hopfield Networks
Yonghao Xu
Weikang Yu
Pedram Ghamisi
Michael K Kopp
Sepp Hochreiter
66
34
0
08 Aug 2022
Controlled Sparsity via Constrained Optimization or: How I Learned to Stop Tuning Penalties and Love Constraints
Jose Gallego-Posada
Juan Ramirez
Akram Erraqabi
Yoshua Bengio
Simon Lacoste-Julien
152
22
0
08 Aug 2022
Learning to Learn to Predict Performance Regressions in Production at Meta
M. Beller
Hongyu Li
V. Nair
V. Murali
Imad Ahmad
Jürgen Cito
Drew Carlson
Gareth Ari Aye
Wes Dyer
63
5
0
08 Aug 2022
Investigating Efficiently Extending Transformers for Long Input Summarization
Jason Phang
Yao-Min Zhao
Peter J. Liu
RALM, LLMAG
85
63
0
08 Aug 2022
Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning
Ting-Li Chen
Ruixiang Zhang
Geoffrey E. Hinton
DiffM
125
313
0
08 Aug 2022
Abstractive Meeting Summarization: A Survey
Virgile Rennard
Guokan Shang
Julie Hunter
Michalis Vazirgiannis
101
16
0
08 Aug 2022
Social Simulacra: Creating Populated Prototypes for Social Computing Systems
J. Park
Lindsay Popowski
Carrie J. Cai
Meredith Ringel Morris
Percy Liang
Michael S. Bernstein
85
298
0
08 Aug 2022
Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model
Di Wang
Qiming Zhang
Yufei Xu
Jing Zhang
Bo Du
Dacheng Tao
Lefei Zhang
84
257
0
08 Aug 2022
Provable Acceleration of Nesterov's Accelerated Gradient Method over Heavy Ball Method in Training Over-Parameterized Neural Networks
Xin Liu
Wei Tao
Wei Li
Dazhi Zhan
Jun Wang
Zhisong Pan
ODL
78
1
0
08 Aug 2022
On Transfer of Adversarial Robustness from Pretraining to Downstream Tasks
Laura Fee Nern
Harsh Raj
Maurice Georgi
Yash Sharma
AAML
97
4
0
07 Aug 2022
Frozen CLIP Models are Efficient Video Learners
Ziyi Lin
Shijie Geng
Renrui Zhang
Peng Gao
Gerard de Melo
Xiaogang Wang
Jifeng Dai
Yu Qiao
Hongsheng Li
CLIP, VLM
98
209
0
06 Aug 2022
Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models
Margaret Li
Suchin Gururangan
Tim Dettmers
M. Lewis
Tim Althoff
Noah A. Smith
Luke Zettlemoyer
MoMe
110
154
0
05 Aug 2022
A Holistic Approach to Undesired Content Detection in the Real World
Todor Markov
Chong Zhang
Sandhini Agarwal
Tyna Eloundou
Teddy Lee
Steven Adler
Angela Jiang
L. Weng
125
237
0
05 Aug 2022