Language Models are Few-Shot Learners (arXiv:2005.14165)

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 12,288 papers shown
Learning Multiscale Transformer Models for Sequence Generation
Bei Li
Tong Zheng
Yi Jing
Chengbo Jiao
Tong Xiao
Jingbo Zhu
70
9
0
19 Jun 2022
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang
Xiangtai Li
Yabiao Wang
Chengjie Wang
Yibo Yang
Yong Liu
Dacheng Tao
ViT
121
35
0
19 Jun 2022
SAViR-T: Spatially Attentive Visual Reasoning with Transformers
Pritish Sahu
Kalliopi Basioti
Vladimir Pavlovic
LRM
68
16
0
18 Jun 2022
Automatic Summarization of Russian Texts: Comparison of Extractive and Abstractive Methods
Valeriya Goloviznina
Evgeny Kotelnikov
41
4
0
18 Jun 2022
Pre-training Enhanced Spatial-temporal Graph Neural Network for Multivariate Time Series Forecasting
Zezhi Shao
Zhao Zhang
Fei Wang
Yongjun Xu
AI4TS
109
228
0
18 Jun 2022
AnyMorph: Learning Transferable Policies By Inferring Agent Morphology
Brandon Trabucco
Mariano Phielipp
Glen Berseth
76
28
0
17 Jun 2022
Unified-IO: A Unified Model for Vision, Language, and Multi-Modal Tasks
Jiasen Lu
Christopher Clark
Rowan Zellers
Roozbeh Mottaghi
Aniruddha Kembhavi
ObjD VLM MLLM
163
412
0
17 Jun 2022
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Linxi Fan
Guanzhi Wang
Yunfan Jiang
Ajay Mandlekar
Yuncong Yang
Haoyi Zhu
Andrew Tang
De-An Huang
Yuke Zhu
Anima Anandkumar
LM&Ro
144
388
0
17 Jun 2022
Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models
Maribeth Rauh
John F. J. Mellor
J. Uesato
Po-Sen Huang
Johannes Welbl
...
Amelia Glaese
G. Irving
Iason Gabriel
William S. Isaac
Lisa Anne Hendricks
124
52
0
16 Jun 2022
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
Antoine Yang
Antoine Miech
Josef Sivic
Ivan Laptev
Cordelia Schmid
149
239
0
16 Jun 2022
Self-Generated In-Context Learning: Leveraging Auto-regressive Language Models as a Demonstration Generator
Sungmin Cho
Hyunsoo Cho
Junyeob Kim
Taeuk Kim
Kang Min Yoo
Sang-goo Lee
104
66
0
16 Jun 2022
Patch-level Representation Learning for Self-supervised Vision Transformers
Sukmin Yun
Hankook Lee
Jaehyung Kim
Jinwoo Shin
ViT
120
68
0
16 Jun 2022
Multimodal Dialogue State Tracking
Hung Le
Nancy F. Chen
Guosheng Lin
67
9
0
16 Jun 2022
Let Invariant Rationale Discovery Inspire Graph Contrastive Learning
Changhao Nai
Xiang Wang
An Zhang
Y. Wu
Xiangnan He
Tat-Seng Chua
87
95
0
16 Jun 2022
TOKEN is a MASK: Few-shot Named Entity Recognition with Pre-trained Language Models
A. Davody
David Ifeoluwa Adelani
Thomas Kleinbauer
Dietrich Klakow
75
4
0
15 Jun 2022
Alexa Teacher Model: Pretraining and Distilling Multi-Billion-Parameter Encoders for Natural Language Understanding Systems
Jack G. M. FitzGerald
Shankar Ananthakrishnan
Konstantine Arkoudas
Davide Bernardi
Abhishek Bhagia
...
Pan Wei
Haiyang Yu
Shuai Zheng
Gokhan Tur
Premkumar Natarajan
ELM
43
30
0
15 Jun 2022
Masked Frequency Modeling for Self-Supervised Visual Pre-Training
Jiahao Xie
Wei Li
Xiaohang Zhan
Ziwei Liu
Yew-Soon Ong
Chen Change Loy
113
74
0
15 Jun 2022
Masked Siamese ConvNets
L. Jing
Jiachen Zhu
Yann LeCun
SSL
114
35
0
15 Jun 2022
DIRECTOR: Generator-Classifiers For Supervised Language Modeling
Kushal Arora
Kurt Shuster
Sainbayar Sukhbaatar
Jason Weston
VLM
98
41
0
15 Jun 2022
Emergent Abilities of Large Language Models
Jason W. Wei
Yi Tay
Rishi Bommasani
Colin Raffel
Barret Zoph
...
Tatsunori Hashimoto
Oriol Vinyals
Percy Liang
J. Dean
W. Fedus
ELM ReLM LRM
320
2,524
0
15 Jun 2022
A Unified Sequence Interface for Vision Tasks
Ting-Li Chen
Saurabh Saxena
Lala Li
Nayeon Lee
David J. Fleet
Geoffrey E. Hinton
VLM MLLM
81
152
0
15 Jun 2022
SP-ViT: Learning 2D Spatial Priors for Vision Transformers
Yuxuan Zhou
Wangmeng Xiang
Chong Li
Biao Wang
Xihan Wei
Lei Zhang
Margret Keuper
Xia Hua
ViT
71
15
0
15 Jun 2022
The SIGMORPHON 2022 Shared Task on Morpheme Segmentation
Khuyagbaatar Batsuren
Gábor Bella
Aryaman Arora
Viktor Martinović
Kyle Gorman
...
Magda Ševčíková
Kateřina Pelegrinová
Fausto Giunchiglia
Ryan Cotterell
Ekaterina Vylomova
62
40
0
15 Jun 2022
NatGen: Generative pre-training by "Naturalizing" source code
Saikat Chakraborty
Toufique Ahmed
Yangruibo Ding
Prem Devanbu
Baishakhi Ray
AI4CE
116
117
0
15 Jun 2022
A Comprehensive Survey on Deep Clustering: Taxonomy, Challenges, and Future Directions
Sheng Zhou
Hongjia Xu
Zhuonan Zheng
Jiawei Chen
Zhao Li
Jiajun Bu
Jia Wu
Xin Eric Wang
Wenwu Zhu
Martin Ester
99
103
0
15 Jun 2022
Forecasting of depth and ego-motion with transformers and self-supervision
Houssem-eddine Boulahbal
A. Voicila
Andrew I. Comport
ViT MDE
69
3
0
15 Jun 2022
A smile is all you need: Predicting limiting activity coefficients from SMILES with natural language processing
Benedikt Winter
Clemens Winter
J. Schilling
A. Bardow
73
28
0
15 Jun 2022
VCT: A Video Compression Transformer
Fabian Mentzer
G. Toderici
David C. Minnen
S. Hwang
Sergi Caelles
Mario Lucic
E. Agustsson
ViT
68
108
0
15 Jun 2022
Understanding Narratives through Dimensions of Analogy
Thiloshon Nagarajah
Filip Ilievski
Jay Pujara
33
6
0
14 Jun 2022
LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling
Linjie Li
Zhe Gan
Kevin Qinghong Lin
Chung-Ching Lin
Zicheng Liu
Ce Liu
Lijuan Wang
MLLM VLM
90
84
0
14 Jun 2022
Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt
Sören Mindermann
J. Brauner
Muhammed Razzak
Mrinank Sharma
Andreas Kirsch
...
Benedikt Höltgen
Aidan Gomez
Adrien Morisot
Sebastian Farquhar
Y. Gal
128
165
0
14 Jun 2022
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Kaifeng Lyu
Zhiyuan Li
Sanjeev Arora
FAtt
121
75
0
14 Jun 2022
Text Generation with Text-Editing Models
Eric Malmi
Yue Dong
Jonathan Mallinson
A. Chuklin
Jakub Adamek
Daniil Mirylenka
Felix Stahlberg
Sebastian Krause
Shankar Kumar
Aliaksei Severyn
KELM
60
26
0
14 Jun 2022
ProcTHOR: Large-Scale Embodied AI Using Procedural Generation
Matt Deitke
Eli VanderBilt
Alvaro Herrasti
Luca Weihs
Jordi Salvador
...
Winson Han
Eric Kolve
Ali Farhadi
Aniruddha Kembhavi
Roozbeh Mottaghi
LM&Ro
124
265
0
14 Jun 2022
FETILDA: An Effective Framework For Fin-tuned Embeddings For Long Financial Text Documents
Bolun Namir Xia
Vipula Rawte
Mohammed J Zaki
Aparna Gupta
AI4TS
13
1
0
14 Jun 2022
CERT: Continual Pre-Training on Sketches for Library-Oriented Code Generation
Daoguang Zan
Bei Chen
Dejian Yang
Zeqi Lin
Minsu Kim
Bei Guan
Yongji Wang
Weizhu Chen
Jian-Guang Lou
86
129
0
14 Jun 2022
The Metaverse Data Deluge: What Can We Do About It?
Beng Chin Ooi
Gang Chen
Mike Zheng Shou
K. Tan
A. Tung
X. Xiao
J. Yip
Meihui Zhang
74
10
0
14 Jun 2022
Exploring Adversarial Attacks and Defenses in Vision Transformers trained with DINO
Javier Rando
Nasib Naimi
Thomas Baumann
Max Mathys
AAML
53
6
0
14 Jun 2022
LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks
Tuan Dinh
Yuchen Zeng
Ruisu Zhang
Ziqian Lin
Michael Gira
Shashank Rajput
Jy-yong Sohn
Dimitris Papailiopoulos
Kangwook Lee
LMTD
171
140
0
14 Jun 2022
Memory-Based Model Editing at Scale
E. Mitchell
Charles Lin
Antoine Bosselut
Christopher D. Manning
Chelsea Finn
KELM
116
362
0
13 Jun 2022
Multimodal Learning with Transformers: A Survey
Peng Xu
Xiatian Zhu
David Clifton
ViT
233
574
0
13 Jun 2022
Compositional Mixture Representations for Vision and Text
Stephan Alaniz
Marco Federici
Zeynep Akata
CoGe OCL VLM
68
2
0
13 Jun 2022
Modern Distributed Data-Parallel Large-Scale Pre-training Strategies For NLP models
Haoli Bai
MoE
138
5
0
13 Jun 2022
Language Models are General-Purpose Interfaces
Y. Hao
Haoyu Song
Li Dong
Shaohan Huang
Zewen Chi
Wenhui Wang
Shuming Ma
Furu Wei
MLLM
76
102
0
13 Jun 2022
Self-critiquing models for assisting human evaluators
William Saunders
Catherine Yeh
Jeff Wu
Steven Bills
Long Ouyang
Jonathan Ward
Jan Leike
ALM ELM
120
306
0
12 Jun 2022
APT-36K: A Large-scale Benchmark for Animal Pose Estimation and Tracking
Yuxiang Yang
Junjie Yang
Yufei Xu
Jing Zhang
Long Lan
Dacheng Tao
91
44
0
12 Jun 2022
Improving Pre-trained Language Model Fine-tuning with Noise Stability Regularization
Hang Hua
Xingjian Li
Dejing Dou
Chengzhong Xu
Jiebo Luo
94
15
0
12 Jun 2022
Building a Personalized Dialogue System with Prompt-Tuning
Tomohito Kasahara
Daisuke Kawahara
N. Tung
Sheng Li
K. Shinzato
Toshinori Sato
31
19
0
11 Jun 2022
Why is constrained neural language generation particularly challenging?
Cristina Garbacea
Qiaozhu Mei
137
15
0
11 Jun 2022
Generalizable Neural Radiance Fields for Novel View Synthesis with Transformer
Dan Wang
Xinrui Cui
Septimiu Salcudean
Z. Jane Wang
ViT
78
25
0
10 Jun 2022