Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 12,362 papers shown
Title
Life is a Circus and We are the Clowns: Automatically Finding Analogies between Situations and Processes
Oren Sultan
Dafna Shahaf
84
28
0
21 Oct 2022
WikiWhy: Answering and Explaining Cause-and-Effect Questions
Matthew Ho
Aditya Sharma
Justin Chang
Michael Stephen Saxon
Sharon Levy
Yujie Lu
William Yang Wang
ReLM, KELM, LRM
159
19
0
21 Oct 2022
Do Vision-and-Language Transformers Learn Grounded Predicate-Noun Dependencies?
Mitja Nikolaus
Emmanuelle Salin
Stéphane Ayache
Abdellah Fourtassi
Benoit Favre
76
14
0
21 Oct 2022
Neuro-Symbolic Causal Reasoning Meets Signaling Game for Emergent Semantic Communications
Christo Kurisummoottil Thomas
Walid Saad
76
35
0
21 Oct 2022
Evolution of Neural Tangent Kernels under Benign and Adversarial Training
Noel Loo
Ramin Hasani
Alexander Amini
Daniela Rus
AAML
86
13
0
21 Oct 2022
A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models
Alessandro Stolfo
Zhijing Jin
Kumar Shridhar
Bernhard Schölkopf
Mrinmaya Sachan
ELM, OOD, LRM
145
66
0
21 Oct 2022
When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work
Jiawei Zhang
Yushun Zhang
Mingyi Hong
Ruoyu Sun
Zhi-Quan Luo
127
10
0
21 Oct 2022
Optimizing text representations to capture (dis)similarity between political parties
Tanise Ceron
Nico Blokker
Sebastian Padó
47
6
0
21 Oct 2022
Is Encoder-Decoder Redundant for Neural Machine Translation?
Yingbo Gao
Christian Herold
Zijian Yang
Hermann Ney
76
4
0
21 Oct 2022
Augmentation with Projection: Towards an Effective and Efficient Data Augmentation Paradigm for Distillation
Ziqi Wang
Yuexin Wu
Frederick Liu
Daogao Liu
Le Hou
Hongkun Yu
Jing Li
Heng Ji
81
5
0
21 Oct 2022
Efficiently Tuned Parameters are Task Embeddings
Wangchunshu Zhou
Canwen Xu
Julian McAuley
58
8
0
21 Oct 2022
Graphically Structured Diffusion Models
Christian D. Weilbach
William Harvey
Frank Wood
DiffM
87
7
0
20 Oct 2022
Using Large Language Models to Enhance Programming Error Messages
Juho Leinonen
Arto Hellas
Sami Sarsa
B. Reeves
Paul Denny
James Prather
Brett A. Becker
67
187
0
20 Oct 2022
Boosting Natural Language Generation from Instructions with Meta-Learning
Budhaditya Deb
Guoqing Zheng
Ahmed Hassan Awadallah
72
16
0
20 Oct 2022
Large Language Models Can Self-Improve
Jiaxin Huang
S. Gu
Le Hou
Yuexin Wu
Xuezhi Wang
Hongkun Yu
Jiawei Han
ReLM, AI4MH, LRM
224
618
0
20 Oct 2022
3DALL-E: Integrating Text-to-Image AI in 3D Design Workflows
Vivian Liu
Jo Vermeulen
G. Fitzmaurice
Justin Matejka
HAI
88
126
0
20 Oct 2022
Dense Paraphrasing for Textual Enrichment
Jingxuan Tu
Kyeongmin Rim
E. Holderness
James Pustejovsky
62
6
0
20 Oct 2022
Composing Ensembles of Pre-trained Models via Iterative Consensus
Shuang Li
Yilun Du
J. Tenenbaum
Antonio Torralba
Igor Mordatch
MoMe
73
25
0
20 Oct 2022
ObSynth: An Interactive Synthesis System for Generating Object Models from Natural Language Specifications
Alex Gu
Tamara Mitrovska
D. Vélez
Jacob Andreas
Armando Solar-Lezama
SyDa
76
1
0
20 Oct 2022
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM, LRM
303
3,177
0
20 Oct 2022
Transcending Scaling Laws with 0.1% Extra Compute
Yi Tay
Jason W. Wei
Hyung Won Chung
Vinh Q. Tran
David R. So
...
Donald Metzler
Slav Petrov
N. Houlsby
Quoc V. Le
Mostafa Dehghani
LRM
109
71
0
20 Oct 2022
Late Prompt Tuning: A Late Prompt Could Be Better Than Many Prompts
Xiangyang Liu
Tianxiang Sun
Xuanjing Huang
Xipeng Qiu
VLM
103
29
0
20 Oct 2022
Disentangling Reasoning Capabilities from Language Models with Compositional Reasoning Transformers
Wanjun Zhong
Tingting Ma
Jiahai Wang
Jian Yin
Tiejun Zhao
Chin-Yew Lin
Nan Duan
LRM, CoGe
79
2
0
20 Oct 2022
Pre-training Language Models with Deterministic Factual Knowledge
Shaobo Li
Xiaoguang Li
Lifeng Shang
Chengjie Sun
Bingquan Liu
Zhenzhou Ji
Xin Jiang
Qun Liu
KELM
99
11
0
20 Oct 2022
General Image Descriptors for Open World Image Retrieval using ViT CLIP
Marcos V. Conde
Ivan Aerlic
Simon Jégou
CLIP
80
2
0
20 Oct 2022
Freeze then Train: Towards Provable Representation Learning under Spurious Correlations and Feature Noise
Haotian Ye
James Zou
Linjun Zhang
OOD
82
23
0
20 Oct 2022
MovieCLIP: Visual Scene Recognition in Movies
Digbalay Bose
Rajat Hebbar
Krishna Somandepalli
Haoyang Zhang
Huayu Chen
K. Cole-McLaughlin
Haoran Wang
Shrikanth Narayanan
CLIP
83
22
0
20 Oct 2022
Palm up: Playing in the Latent Manifold for Unsupervised Pretraining
Hao Liu
Tom Zahavy
Volodymyr Mnih
Satinder Singh
SSL
108
7
0
19 Oct 2022
lo-fi: distributed fine-tuning without communication
Mitchell Wortsman
Suchin Gururangan
Shen Li
Ali Farhadi
Ludwig Schmidt
Michael G. Rabbat
Ari S. Morcos
103
24
0
19 Oct 2022
TOIST: Task Oriented Instance Segmentation Transformer with Noun-Pronoun Distillation
Pengfei Li
Beiwen Tian
Yongliang Shi
Xiaoxue Chen
Hao Zhao
Guyue Zhou
Ya Zhang
118
22
0
19 Oct 2022
On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement Learning
Yifan Xu
Nicklas Hansen
Zirui Wang
Yung-Chieh Chan
H. Su
Zhuowen Tu
OffRL
77
17
0
19 Oct 2022
Scaling Laws for Reward Model Overoptimization
Leo Gao
John Schulman
Jacob Hilton
ALM
131
569
0
19 Oct 2022
Schema-aware Reference as Prompt Improves Data-Efficient Knowledge Graph Construction
Yunzhi Yao
Shengyu Mao
Ningyu Zhang
Xiangnan Chen
Shumin Deng
Xi Chen
Huajun Chen
127
12
0
19 Oct 2022
Robustness of Demonstration-based Learning Under Limited Data Scenario
Hongxin Zhang
Yanzhe Zhang
Ruiyi Zhang
Diyi Yang
93
15
0
19 Oct 2022
Language Models Understand Us, Poorly
Jared Moore
LRM
55
4
0
19 Oct 2022
Towards Realistic Low-resource Relation Extraction: A Benchmark with Empirical Baseline Study
Xin Xu
Xiang Chen
Ningyu Zhang
Xin Xie
Xi Chen
Huajun Chen
105
10
0
19 Oct 2022
Extending Graph Transformers with Quantum Computed Aggregation
Slimane Thabet
Romain Fouilland
L. Henriet
GNN
50
3
0
19 Oct 2022
Towards a neural architecture of language: Deep learning versus logistics of access in neural architectures for compositional processing
F. Velde
38
0
0
19 Oct 2022
Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective
Adaku Uchendu
Thai Le
Dongwon Lee
DeLMO
121
45
0
19 Oct 2022
CPL: Counterfactual Prompt Learning for Vision and Language Models
Xuehai He
Diji Yang
Weixi Feng
Tsu-Jui Fu
Arjun Reddy Akula
Varun Jampani
P. Narayana
Sugato Basu
William Yang Wang
Xinze Wang
VPVLM, VLM
96
15
0
19 Oct 2022
BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining
Renqian Luo
Liai Sun
Yingce Xia
Tao Qin
Sheng Zhang
Hoifung Poon
Tie-Yan Liu
MedIm, AI4CE, LM&MA
139
857
0
19 Oct 2022
Forging Multiple Training Objectives for Pre-trained Language Models via Meta-Learning
Hongqiu Wu
Ruixue Ding
Haizhen Zhao
Boli Chen
Pengjun Xie
Fei Huang
Min Zhang
MoMe
95
8
0
19 Oct 2022
Continued Pretraining for Better Zero- and Few-Shot Promptability
Zhaofeng Wu
Robert L. Logan IV
Pete Walsh
Akshita Bhagia
Dirk Groeneveld
Sameer Singh
Iz Beltagy
VLM
106
12
0
19 Oct 2022
Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models
Luke Vilnis
Yury Zemlyanskiy
Patrick C. Murray
Alexandre Passos
Sumit Sanghai
105
10
0
18 Oct 2022
Entity-Focused Dense Passage Retrieval for Outside-Knowledge Visual Question Answering
Jialin Wu
Raymond J. Mooney
RALM
138
11
0
18 Oct 2022
ELASTIC: Numerical Reasoning with Adaptive Symbolic Compiler
Jiaxin Zhang
Yashar Moshfeghi
AIMat
68
18
0
18 Oct 2022
From Play to Policy: Conditional Behavior Generation from Uncurated Robot Data
Zichen Jeff Cui
Yibin Wang
Nur Muhammad (Mahi) Shafiullah
Lerrel Pinto
LM&Ro, VGen, OffRL
100
95
0
18 Oct 2022
SafeText: A Benchmark for Exploring Physical Safety in Language Models
Sharon Levy
Emily Allaway
Melanie Subbiah
Lydia B. Chilton
D. Patton
Kathleen McKeown
William Yang Wang
96
45
0
18 Oct 2022
The Tail Wagging the Dog: Dataset Construction Biases of Social Bias Benchmarks
Nikil Selvam
Sunipa Dev
Daniel Khashabi
Tushar Khot
Kai-Wei Chang
ALM
72
26
0
18 Oct 2022
Tiny-Attention Adapter: Contexts Are More Important Than the Number of Parameters
Hongyu Zhao
Hao Tan
Hongyuan Mei
MoE
81
18
0
18 Oct 2022