ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2005.14165
  4. Cited By
Language Models are Few-Shot Learners
v1v2v3v4 (latest)

Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Ma-teusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL
ArXiv (abs)PDFHTML

Papers citing "Language Models are Few-Shot Learners"

50 / 12,427 papers shown
Title
A Neural ODE Interpretation of Transformer Layers
A Neural ODE Interpretation of Transformer Layers
Yaofeng Desmond Zhong
Tongtao Zhang
Amit Chakraborty
Biswadip Dey
139
10
0
12 Dec 2022
Continuation KD: Improved Knowledge Distillation through the Lens of
  Continuation Optimization
Continuation KD: Improved Knowledge Distillation through the Lens of Continuation Optimization
A. Jafari
I. Kobyzev
Mehdi Rezagholizadeh
Pascal Poupart
A. Ghodsi
VLM
75
5
0
12 Dec 2022
Federated Few-Shot Learning for Mobile NLP
Federated Few-Shot Learning for Mobile NLP
Dongqi Cai
Shangguang Wang
Yaozong Wu
F. Lin
Mengwei Xu
FedML
93
12
0
12 Dec 2022
Improving Generalization of Pre-trained Language Models via Stochastic
  Weight Averaging
Improving Generalization of Pre-trained Language Models via Stochastic Weight Averaging
Peng Lu
I. Kobyzev
Mehdi Rezagholizadeh
Ahmad Rashid
A. Ghodsi
Philippe Langlais
MoMe
104
11
0
12 Dec 2022
Collaborating Heterogeneous Natural Language Processing Tasks via
  Federated Learning
Collaborating Heterogeneous Natural Language Processing Tasks via Federated Learning
Chenhe Dong
Yuexiang Xie
Bolin Ding
Ying Shen
Yaliang Li
FedML
62
5
0
12 Dec 2022
On Pre-Training for Visuo-Motor Control: Revisiting a
  Learning-from-Scratch Baseline
On Pre-Training for Visuo-Motor Control: Revisiting a Learning-from-Scratch Baseline
Nicklas Hansen
Zhecheng Yuan
Yanjie Ze
Tongzhou Mu
Aravind Rajeswaran
H. Su
Huazhe Xu
Xiaolong Wang
89
66
0
12 Dec 2022
Implementing Deep Learning-Based Approaches for Article Summarization in
  Indian Languages
Implementing Deep Learning-Based Approaches for Article Summarization in Indian Languages
Rahul Tangsali
Aabha Pingle
Aditya Vyawahare
Isha Joshi
Raviraj Joshi
90
7
0
12 Dec 2022
A Study of Slang Representation Methods
A Study of Slang Representation Methods
Aravinda Kolla
Filip Ilievski
Hông-Ân Sandlin
Alain Mermoud
58
1
0
11 Dec 2022
Elixir: Train a Large Language Model on a Small GPU Cluster
Elixir: Train a Large Language Model on a Small GPU Cluster
Haichen Huang
Jiarui Fang
Hongxin Liu
Shenggui Li
Yang You
VLM
79
7
0
10 Dec 2022
MAPS-KB: A Million-scale Probabilistic Simile Knowledge Base
MAPS-KB: A Million-scale Probabilistic Simile Knowledge Base
Qi He
Xintao Wang
Jiaqing Liang
Yanghua Xiao
78
3
0
10 Dec 2022
Structured information extraction from complex scientific text with
  fine-tuned large language models
Structured information extraction from complex scientific text with fine-tuned large language models
Alex Dunn
John Dagdelen
Nicholas Walker
Sanghoon Lee
Andrew S. Rosen
Gerbrand Ceder
Kristin A. Persson
Anubhav Jain
97
93
0
10 Dec 2022
LEAD: Liberal Feature-based Distillation for Dense Retrieval
LEAD: Liberal Feature-based Distillation for Dense Retrieval
Hao Sun
Xiao Liu
Yeyun Gong
Anlei Dong
Jing Lu
Yan Zhang
Linjun Yang
Rangan Majumder
Nan Duan
119
2
0
10 Dec 2022
REVEAL: Retrieval-Augmented Visual-Language Pre-Training with
  Multi-Source Multimodal Knowledge Memory
REVEAL: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory
Ziniu Hu
Ahmet Iscen
Chen Sun
Zirui Wang
Kai-Wei Chang
Yizhou Sun
Cordelia Schmid
David A. Ross
Alireza Fathi
RALMVLM
103
96
0
10 Dec 2022
MAGVIT: Masked Generative Video Transformer
MAGVIT: Masked Generative Video Transformer
Lijun Yu
Yong Cheng
Kihyuk Sohn
José Lezama
Han Zhang
...
Alexander G. Hauptmann
Ming-Hsuan Yang
Yuan Hao
Irfan Essa
Lu Jiang
DiffMVGen
121
248
0
10 Dec 2022
SMILE: Scaling Mixture-of-Experts with Efficient Bi-level Routing
SMILE: Scaling Mixture-of-Experts with Efficient Bi-level Routing
Chaoyang He
Shuai Zheng
Aston Zhang
George Karypis
Trishul Chilimbi
Mahdi Soltanolkotabi
Salman Avestimehr
MoE
50
1
0
10 Dec 2022
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints
Aran Komatsuzaki
J. Puigcerver
James Lee-Thorp
Carlos Riquelme Ruiz
Basil Mustafa
Joshua Ainslie
Yi Tay
Mostafa Dehghani
N. Houlsby
MoMeMoE
106
124
0
09 Dec 2022
Audiovisual Masked Autoencoders
Audiovisual Masked Autoencoders
Mariana-Iuliana Georgescu
Eduardo Fonseca
Radu Tudor Ionescu
Mario Lucic
Cordelia Schmid
Anurag Arnab
SSL
118
45
0
09 Dec 2022
Spurious Features Everywhere -- Large-Scale Detection of Harmful
  Spurious Features in ImageNet
Spurious Features Everywhere -- Large-Scale Detection of Harmful Spurious Features in ImageNet
Yannic Neuhaus
Maximilian Augustin
Valentyn Boreiko
Matthias Hein
AAML
134
32
0
09 Dec 2022
Open-world Story Generation with Structured Knowledge Enhancement: A
  Comprehensive Survey
Open-world Story Generation with Structured Knowledge Enhancement: A Comprehensive Survey
Yuxin Wang
Jieru Lin
Zhiwei Yu
Wei Hu
Börje F. Karlsson
134
20
0
09 Dec 2022
Training Data Influence Analysis and Estimation: A Survey
Training Data Influence Analysis and Estimation: A Survey
Zayd Hammoudeh
Daniel Lowd
TDI
119
101
0
09 Dec 2022
Structured Like a Language Model: Analysing AI as an Automated Subject
Structured Like a Language Model: Analysing AI as an Automated Subject
Liam Magee
Vanicka Arora
Luke Munn
16
21
0
08 Dec 2022
VideoDex: Learning Dexterity from Internet Videos
VideoDex: Learning Dexterity from Internet Videos
Kenneth Shaw
Shikhar Bahl
Deepak Pathak
101
96
0
08 Dec 2022
Learning Video Representations from Large Language Models
Learning Video Representations from Large Language Models
Yue Zhao
Ishan Misra
Philipp Krahenbuhl
Rohit Girdhar
VLMAI4TS
118
177
0
08 Dec 2022
General-Purpose In-Context Learning by Meta-Learning Transformers
General-Purpose In-Context Learning by Meta-Learning Transformers
Louis Kirsch
James Harrison
Jascha Narain Sohl-Dickstein
Luke Metz
131
78
0
08 Dec 2022
Task Bias in Vision-Language Models
Task Bias in Vision-Language Models
Sachit Menon
I. Chandratreya
Carl Vondrick
VLMSSL
72
6
0
08 Dec 2022
OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist
  Models
OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist Models
Jinze Bai
Rui Men
Han Yang
Xuancheng Ren
Kai Dang
...
Wenhang Ge
Jianxin Ma
Junyang Lin
Jingren Zhou
Chang Zhou
88
16
0
08 Dec 2022
A Rubric for Human-like Agents and NeuroAI
A Rubric for Human-like Agents and NeuroAI
Ida Momennejad
128
14
0
08 Dec 2022
Skellam Mixture Mechanism: a Novel Approach to Federated Learning with
  Differential Privacy
Skellam Mixture Mechanism: a Novel Approach to Federated Learning with Differential Privacy
Ergute Bao
Yizheng Zhu
X. Xiao
Yifan Yang
Beng Chin Ooi
B. Tan
Khin Mi Mi Aung
FedML
82
19
0
08 Dec 2022
Harnessing the Power of Multi-Task Pretraining for Ground-Truth Level
  Natural Language Explanations
Harnessing the Power of Multi-Task Pretraining for Ground-Truth Level Natural Language Explanations
Björn Plüster
Jakob Ambsdorf
Lukas Braach
Jae Hee Lee
S. Wermter
76
6
0
08 Dec 2022
DC-MBR: Distributional Cooling for Minimum Bayesian Risk Decoding
DC-MBR: Distributional Cooling for Minimum Bayesian Risk Decoding
Jianhao Yan
Jin Xu
Fandong Meng
Jie Zhou
Yue Zhang
107
4
0
08 Dec 2022
Deep Incubation: Training Large Models by Divide-and-Conquering
Deep Incubation: Training Large Models by Divide-and-Conquering
Zanlin Ni
Yulin Wang
Jiangwei Yu
Haojun Jiang
Yu Cao
Gao Huang
VLM
92
11
0
08 Dec 2022
Group Generalized Mean Pooling for Vision Transformer
Group Generalized Mean Pooling for Vision Transformer
ByungSoo Ko
Han-Gyu Kim
Byeongho Heo
Sangdoo Yun
Sanghyuk Chun
Geonmo Gu
Wonjae Kim
ViT
88
1
0
08 Dec 2022
NP4G : Network Programming for Generalization
NP4G : Network Programming for Generalization
Shoichiro Hara
Yuji Watanabe
NAIAI4CE
20
0
0
08 Dec 2022
Successive Prompting for Decomposing Complex Questions
Successive Prompting for Decomposing Complex Questions
Dheeru Dua
Shivanshu Gupta
Sameer Singh
Matt Gardner
ReLMLRM
111
118
0
08 Dec 2022
Editing Models with Task Arithmetic
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELMMoMeMU
213
523
0
08 Dec 2022
LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large
  Language Models
LLM-Planner: Few-Shot Grounded Planning for Embodied Agents with Large Language Models
Chan Hee Song
Jiaman Wu
Clay Washington
Brian M Sadler
Wei-Lun Chao
Yu-Chuan Su
LLMAGLM&Ro
173
425
0
08 Dec 2022
Demystifying Prompts in Language Models via Perplexity Estimation
Demystifying Prompts in Language Models via Perplexity Estimation
Hila Gonen
Srini Iyer
Terra Blevins
Noah A. Smith
Luke Zettlemoyer
LRM
159
214
0
08 Dec 2022
Transfer learning for chemically accurate interatomic neural network
  potentials
Transfer learning for chemically accurate interatomic neural network potentials
Viktor Zaverkin
David Holzmüller
Luca Bonfirraro
Johannes Kastner
101
25
0
07 Dec 2022
Discovering Latent Knowledge in Language Models Without Supervision
Discovering Latent Knowledge in Language Models Without Supervision
Collin Burns
Haotian Ye
Dan Klein
Jacob Steinhardt
163
386
0
07 Dec 2022
Robustness of Learning from Task Instructions
Robustness of Learning from Task Instructions
Jiasheng Gu
Hongyu Zhao
Hanzi Xu
Liang Nie
Hongyuan Mei
Wenpeng Yin
OOD
101
34
0
07 Dec 2022
Pivotal Role of Language Modeling in Recommender Systems: Enriching
  Task-specific and Task-agnostic Representation Learning
Pivotal Role of Language Modeling in Recommender Systems: Enriching Task-specific and Task-agnostic Representation Learning
Kyuyong Shin
Hanock Kwak
Wonjae Kim
Jisu Jeong
Seungjae Jung
KyungHyun Kim
Jung-Woo Ha
Sang-Woo Lee
84
4
0
07 Dec 2022
Memorization of Named Entities in Fine-tuned BERT Models
Memorization of Named Entities in Fine-tuned BERT Models
Andor Diera
N. Lell
Aygul Garifullina
A. Scherp
68
0
0
07 Dec 2022
Harnessing Knowledge and Reasoning for Human-Like Natural Language
  Generation: A Brief Review
Harnessing Knowledge and Reasoning for Human-Like Natural Language Generation: A Brief Review
Jiangjie Chen
Yanghua Xiao
116
5
0
07 Dec 2022
The problem with AI consciousness: A neurogenetic case against synthetic
  sentience
The problem with AI consciousness: A neurogenetic case against synthetic sentience
Yoshija Walter
L. Zbinden
44
1
0
07 Dec 2022
DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and
  Training Efficiency via Efficient Data Sampling and Routing
DeepSpeed Data Efficiency: Improving Deep Learning Model Quality and Training Efficiency via Efficient Data Sampling and Routing
Conglong Li
Z. Yao
Xiaoxia Wu
Minjia Zhang
Connor Holmes
Cheng Li
Yuxiong He
69
25
0
07 Dec 2022
ZegCLIP: Towards Adapting CLIP for Zero-shot Semantic Segmentation
ZegCLIP: Towards Adapting CLIP for Zero-shot Semantic Segmentation
Ziqi Zhou
Bowen Zhang
Yinjie Lei
Lingqiao Liu
Yifan Liu
VLM
97
176
0
07 Dec 2022
Talking About Large Language Models
Talking About Large Language Models
Murray Shanahan
AI4CE
132
275
0
07 Dec 2022
Text Embeddings by Weakly-Supervised Contrastive Pre-training
Text Embeddings by Weakly-Supervised Contrastive Pre-training
Liang Wang
Nan Yang
Xiaolong Huang
Binxing Jiao
Linjun Yang
Daxin Jiang
Rangan Majumder
Furu Wei
VLM
263
624
0
07 Dec 2022
"It would work for me too": How Online Communities Shape Software
  Developers' Trust in AI-Powered Code Generation Tools
"It would work for me too": How Online Communities Shape Software Developers' Trust in AI-Powered Code Generation Tools
Ruijia Cheng
Ruotong Wang
Thomas Zimmermann
Denae Ford
109
33
0
07 Dec 2022
SimVTP: Simple Video Text Pre-training with Masked Autoencoders
SimVTP: Simple Video Text Pre-training with Masked Autoencoders
Yue Ma
Tianyu Yang
Yin Shan
Xiu Li
88
27
0
07 Dec 2022
Previous
123...172173174...247248249
Next