ResearchTrend.AI
Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 12,362 papers shown
Generalization Properties of Retrieval-based Models
Soumya Basu
A. S. Rawat
Manzil Zaheer
65
6
0
06 Oct 2022
Learning to Reason With Relational Abstractions
A. Nam
Mengye Ren
Chelsea Finn
James L. McClelland
ReLM, LRM
103
5
0
06 Oct 2022
Privacy-Preserving Text Classification on BERT Embeddings with Homomorphic Encryption
Garam Lee
Minsoo Kim
J. Park
Seung-won Hwang
Jung Hee Cheon
91
18
0
05 Oct 2022
Large Language Models are Pretty Good Zero-Shot Video Game Bug Detectors
Mohammad Reza Taesiri
Finlay Macklon
Yihe Wang
Hengshuo Shen
Cor-Paul Bezemer
ELM, LLMAG, MLLM
90
13
0
05 Oct 2022
Ask Me Anything: A simple strategy for prompting language models
Simran Arora
A. Narayan
Mayee F. Chen
Laurel J. Orr
Neel Guha
Kush S. Bhatia
Ines Chami
Frederic Sala
Christopher Ré
ReLM, LRM
293
219
0
05 Oct 2022
DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics
Ivan Kapelyukh
Vitalis Vosylius
Edward Johns
LM&Ro, DiffM
236
148
0
05 Oct 2022
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng Zhang
Yuxiao Dong
Jie Tang
BDL, LRM
386
1,101
0
05 Oct 2022
Decomposed Prompting: A Modular Approach for Solving Complex Tasks
Tushar Khot
H. Trivedi
Matthew Finlayson
Yao Fu
Kyle Richardson
Peter Clark
Ashish Sabharwal
ReLM, LRM
162
452
0
05 Oct 2022
Bayesian Prompt Learning for Image-Language Model Generalization
Mohammad Mahdi Derakhshani
Enrique Sanchez
Adrian Bulat
Victor G. Turrisi da Costa
Cees G. M. Snoek
Georgios Tzimiropoulos
Brais Martínez
VPVLM, VLM
171
37
0
05 Oct 2022
Progressive Text-to-Image Generation
Zhengcong Fei
Mingyuan Fan
Li Zhu
Junshi Huang
156
4
0
05 Oct 2022
TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis
Haixu Wu
Teng Hu
Yong Liu
Hang Zhou
Jianmin Wang
Mingsheng Long
AI4TS, AIFin
158
843
0
05 Oct 2022
Fine-Tuning with Differential Privacy Necessitates an Additional Hyperparameter Search
Yannis Cattan
Christopher A. Choquette-Choo
Nicolas Papernot
Abhradeep Thakurta
67
21
0
05 Oct 2022
TgDLF2.0: Theory-guided deep-learning for electrical load forecasting via Transformer and transfer learning
Jiaxin Gao
Wenbo Hu
Dongxiao Zhang
Yuntian Chen
AI4TS, AI4CE
83
2
0
05 Oct 2022
Knowledge Unlearning for Mitigating Privacy Risks in Language Models
Joel Jang
Dongkeun Yoon
Sohee Yang
Sungmin Cha
Moontae Lee
Lajanugen Logeswaran
Minjoon Seo
KELM, PILM, MU
226
239
0
04 Oct 2022
Unveiling the Black Box of PLMs with Semantic Anchors: Towards Interpretable Neural Semantic Parsing
L. Nie
Jiu Sun
Yanlin Wang
Lun Du
Lei Hou
Juanzi Li
Shi Han
Dongmei Zhang
Jidong Zhai
71
6
0
04 Oct 2022
Towards Flexible Inductive Bias via Progressive Reparameterization Scheduling
Yunsung Lee
Gyuseong Lee
Kwang-seok Ryoo
Hyojun Go
Jihye Park
Seung Wook Kim
74
5
0
04 Oct 2022
Less is More: Task-aware Layer-wise Distillation for Language Model Compression
Chen Liang
Simiao Zuo
Qingru Zhang
Pengcheng He
Weizhu Chen
Tuo Zhao
VLM
108
74
0
04 Oct 2022
Recitation-Augmented Language Models
Zhiqing Sun
Xuezhi Wang
Yi Tay
Yiming Yang
Denny Zhou
RALM
275
65
0
04 Oct 2022
ThinkSum: Probabilistic reasoning over sets using large language models
Batu Mehmet Ozturkler
Nikolay Malkin
Zhen Wang
Nebojsa Jojic
ReLM, LRM
138
22
0
04 Oct 2022
Robot Task Planning and Situation Handling in Open Worlds
Yan Ding
Xiaohan Zhang
S. Amiri
Nieqing Cao
Hao Yang
Chad Esselink
Shiqi Zhang
LM&Ro
56
19
0
04 Oct 2022
Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought
Abulhair Saparov
He He
ELM, LRM, ReLM
301
315
0
03 Oct 2022
That Sounds Right: Auditory Self-Supervision for Dynamic Robot Manipulation
Abitha Thankaraj
Lerrel Pinto
68
17
0
03 Oct 2022
LASP: Text-to-Text Optimization for Language-Aware Soft Prompting of Vision & Language Models
Adrian Bulat
Georgios Tzimiropoulos
VLM, VPVLM
62
51
0
03 Oct 2022
Contrastive Multimodal Learning for Emergence of Graphical Sensory-Motor Communication
Tristan Karch
Yoann Lemesle
Romain Laroche
Clément Moulin-Frier
Pierre-Yves Oudeyer
61
1
0
03 Oct 2022
CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-training
Tianyu Huang
Bowen Dong
Yunhan Yang
Xiaoshui Huang
Rynson W. H. Lau
Wanli Ouyang
W. Zuo
VLM, 3DPC, CLIP
136
150
0
03 Oct 2022
Visual Prompt Tuning for Generative Transfer Learning
Kihyuk Sohn
Yuan Hao
José Lezama
Luisa F. Polanía
Huiwen Chang
Han Zhang
Irfan Essa
Lu Jiang
VPVLM, VLM
161
89
0
03 Oct 2022
Dancing with the Unexpected and Beyond: The Use of AI Assistance in Design Fiction Creation
Yiying Wu
Yunye Yu
Pengcheng An
37
6
0
03 Oct 2022
Complexity-Based Prompting for Multi-Step Reasoning
Yao Fu
Hao-Chun Peng
Ashish Sabharwal
Peter Clark
Tushar Khot
ReLM, LRM
257
446
0
03 Oct 2022
A Non-monotonic Self-terminating Language Model
Eugene Choi
Kyunghyun Cho
Cheolhyoung Lee
LRM
17
0
0
03 Oct 2022
Benign Autoencoders
Semyon Malamud
Teng Andrea Xu
Antoine Didisheim
DRL, AI4CE
32
0
0
02 Oct 2022
Systematic Generalization and Emergent Structures in Transformers Trained on Structured Tasks
Yuxuan Li
James L. McClelland
128
19
0
02 Oct 2022
MALM: Mixing Augmented Language Modeling for Zero-Shot Machine Translation
Kshitij Gupta
VLM, LRM
59
2
0
01 Oct 2022
CRISP: Curriculum based Sequential Neural Decoders for Polar Code Family
Ashwin Hebbar
Viraj Nadkarni
Ashok Vardhan Makkuva
S. Bhat
Sewoong Oh
Pramod Viswanath
63
8
0
01 Oct 2022
Multimodal Analogical Reasoning over Knowledge Graphs
Ningyu Zhang
Lei Li
Xiang Chen
Xiaozhuan Liang
Shumin Deng
Huajun Chen
135
28
0
01 Oct 2022
LambdaKG: A Library for Pre-trained Language Model-Based Knowledge Graph Embeddings
Xin Xie
Zhoubo Li
Xiaohan Wang
Feiyu Xiong
Ningyu Zhang
109
11
0
01 Oct 2022
Cascaded Multi-Modal Mixing Transformers for Alzheimer's Disease Classification with Incomplete Data
Linfeng Liu
Siyu Liu
Lu Zhang
X. To
F. Nasrallah
Shekhar S. Chandra
MedIm
60
57
0
01 Oct 2022
FRMT: A Benchmark for Few-Shot Region-Aware Machine Translation
Parker Riley
Timothy Dozat
Jan A. Botha
Xavier Garcia
Dan Garrette
Jason Riesa
Orhan Firat
Noah Constant
130
18
0
01 Oct 2022
Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks
Zhenhailong Wang
Xiaoman Pan
Dian Yu
Dong Yu
Jianshu Chen
Heng Ji
VLM
109
10
0
01 Oct 2022
Predictive Inference with Feature Conformal Prediction
Jiaye Teng
Chuan Wen
Dinghuai Zhang
Yoshua Bengio
Yang Gao
Yang Yuan
127
27
0
01 Oct 2022
Shockwave: Fair and Efficient Cluster Scheduling for Dynamic Adaptation in Machine Learning
Pengfei Zheng
Rui Pan
Tarannum Khan
Shivaram Venkataraman
Aditya Akella
88
22
0
30 Sep 2022
Differentially Private Optimization on Large Model at Small Cost
Zhiqi Bu
Yu Wang
Sheng Zha
George Karypis
112
55
0
30 Sep 2022
Differentially Private Bias-Term Fine-tuning of Foundation Models
Zhiqi Bu
Yu Wang
Sheng Zha
George Karypis
126
48
0
30 Sep 2022
Out-of-Distribution Detection and Selective Generation for Conditional Language Models
Jie Jessie Ren
Jiaming Luo
Yao-Min Zhao
Kundan Krishna
Mohammad Saleh
Balaji Lakshminarayanan
Peter J. Liu
OODD
129
114
0
30 Sep 2022
Zero-Shot Retrieval with Search Agents and Hybrid Environments
Michelle Chen Huebscher
Christian Buck
Massimiliano Ciaramita
S. Rothe
137
9
0
30 Sep 2022
A Novel Explainable Out-of-Distribution Detection Approach for Spiking Neural Networks
Aitor Martinez Seras
Javier Del Ser
J. Lobo
Pablo Garcia-Bringas
N. Kasabov
OODD
31
1
0
30 Sep 2022
AudioGen: Textually Guided Audio Generation
Felix Kreuk
Gabriel Synnaeve
Adam Polyak
Uriel Singer
Alexandre Défossez
Jade Copet
Devi Parikh
Yaniv Taigman
Yossi Adi
DiffM
127
309
0
30 Sep 2022
EF21-P and Friends: Improved Theoretical Communication Complexity for Distributed Optimization with Bidirectional Compression
Kaja Gruntkowska
Alexander Tyurin
Peter Richtárik
152
24
0
30 Sep 2022
What Makes Pre-trained Language Models Better Zero-shot Learners?
Jinghui Lu
Dongsheng Zhu
Weidong Han
Rui Zhao
Brian Mac Namee
Fei Tan
101
24
0
30 Sep 2022
Self-Distillation for Further Pre-training of Transformers
Seanie Lee
Minki Kang
Juho Lee
Sung Ju Hwang
Kenji Kawaguchi
96
8
0
30 Sep 2022
Depth-Wise Attention (DWAtt): A Layer Fusion Method for Data-Efficient Classification
Muhammad N. ElNokrashy
Badr AlKhamissi
Mona T. Diab
MoMe
90
5
0
30 Sep 2022