ResearchTrend.AI
arXiv:2005.14165
Language Models are Few-Shot Learners


28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 11,025 papers shown
Title
Do Different Tracking Tasks Require Different Appearance Models?
Zhongdao Wang
Hengshuang Zhao
Yali Li
Shengjin Wang
Philip Torr
Luca Bertinetto
37
82
0
05 Jul 2021
Random Neural Networks in the Infinite Width Limit as Gaussian Processes
Boris Hanin
BDL
32
43
0
04 Jul 2021
BAGUA: Scaling up Distributed Learning with System Relaxations
Shaoduo Gan
Xiangru Lian
Rui Wang
Jianbin Chang
Chengjun Liu
...
Jiawei Jiang
Binhang Yuan
Sen Yang
Ji Liu
Ce Zhang
25
30
0
03 Jul 2021
Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation
Zhiwei Hao
Jianyuan Guo
Ding Jia
Kai Han
Yehui Tang
Chao Zhang
Dacheng Tao
Yunhe Wang
ViT
33
68
0
03 Jul 2021
Is GPT-3 Text Indistinguishable from Human Text? Scarecrow: A Framework for Scrutinizing Machine Text
Yao Dou
Maxwell Forbes
Rik Koncel-Kedziorski
Noah A. Smith
Yejin Choi
DeLMO
17
126
0
02 Jul 2021
Solving Machine Learning Problems
Sunny Tran
P. Krishna
Ishan Pakuwal
Prabhakar Kafle
Nikhil Singh
J. Lynch
Iddo Drori
VLM
21
11
0
02 Jul 2021
SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents
Grgur Kovač
Rémy Portelas
Katja Hofmann
Pierre-Yves Oudeyer
ALM
27
6
0
02 Jul 2021
Neural Task Success Classifiers for Robotic Manipulation from Few Real Demonstrations
A. Mohtasib
Amir Ghalamzan
Nicola Bellotto
Heriberto Cuayáhuitl
13
1
0
01 Jul 2021
Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation
Nicklas Hansen
H. Su
Xiaolong Wang
OffRL
26
134
0
01 Jul 2021
Focal Self-attention for Local-Global Interactions in Vision Transformers
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
ViT
42
428
0
01 Jul 2021
All That's 'Human' Is Not Gold: Evaluating Human Evaluation of Generated Text
Elizabeth Clark
Tal August
Sofia Serrano
Nikita Haduong
Suchin Gururangan
Noah A. Smith
DeLMO
51
394
0
30 Jun 2021
ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information
Zijun Sun
Xiaoya Li
Xiaofei Sun
Yuxian Meng
Xiang Ao
Qing He
Fei Wu
Jiwei Li
SSeg
57
183
0
30 Jun 2021
A Generative Model for Raw Audio Using Transformer Architectures
Prateek Verma
C. Chafe
27
28
0
30 Jun 2021
Improving the Efficiency of Transformers for Resource-Constrained Devices
Hamid Tabani
Ajay Balasubramaniam
Shabbir Marzban
Elahe Arani
Bahram Zonooz
41
20
0
30 Jun 2021
SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption
Dara Bahri
Heinrich Jiang
Yi Tay
Donald Metzler
SSL
19
163
0
29 Jun 2021
Time-Aware Language Models as Temporal Knowledge Bases
Bhuwan Dhingra
Jeremy R. Cole
Julian Martin Eisenschlos
D. Gillick
Jacob Eisenstein
William W. Cohen
KELM
28
264
0
29 Jun 2021
Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity
Shiwei Liu
Tianlong Chen
Zahra Atashgahi
Xiaohan Chen
Ghada Sokar
Elena Mocanu
Mykola Pechenizkiy
Zhangyang Wang
Decebal Constantin Mocanu
OOD
31
49
0
28 Jun 2021
R-Drop: Regularized Dropout for Neural Networks
Xiaobo Liang
Lijun Wu
Juntao Li
Yue Wang
Qi Meng
Tao Qin
Wei Chen
Hao Fei
Tie-Yan Liu
47
424
0
28 Jun 2021
Word2Box: Capturing Set-Theoretic Semantics of Words using Box Embeddings
S. Dasgupta
Michael Boratko
Siddhartha Mishra
Shriya Atmakuri
Dhruvesh Patel
Xiang Lorraine Li
Andrew McCallum
NAI
28
21
0
28 Jun 2021
Stabilizing Equilibrium Models by Jacobian Regularization
Shaojie Bai
V. Koltun
J. Zico Kolter
25
57
0
28 Jun 2021
Pairing Conceptual Modeling with Machine Learning
W. Maass
V. Storey
HAI
27
33
0
27 Jun 2021
SymbolicGPT: A Generative Transformer Model for Symbolic Regression
Mojtaba Valipour
Bowen You
Maysum Panju
A. Ghodsi
18
88
0
27 Jun 2021
Visual Conceptual Blending with Large-scale Language and Vision Models
Songwei Ge
Devi Parikh
VLM
DiffM
27
14
0
27 Jun 2021
Core Challenges in Embodied Vision-Language Planning
Jonathan M Francis
Nariaki Kitamura
Felix Labelle
Xiaopeng Lu
Ingrid Navarro
Jean Oh
LM&Ro
47
45
0
26 Jun 2021
Multimodal Few-Shot Learning with Frozen Language Models
Maria Tsimpoukelli
Jacob Menick
Serkan Cabi
S. M. Ali Eslami
Oriol Vinyals
Felix Hill
MLLM
58
749
0
25 Jun 2021
Transflower: probabilistic autoregressive dance generation with multimodal attention
Guillermo Valle Pérez
G. Henter
Jonas Beskow
A. Holzapfel
Pierre-Yves Oudeyer
Simon Alexanderson
30
42
0
25 Jun 2021
ParaLaw Nets -- Cross-lingual Sentence-level Pretraining for Legal Text Processing
Nguyen Ha Thanh
Vu Tran
Phuong Minh Nguyen
Thi-Hai-Yen Vuong
Quan Minh Bui
Chau Nguyen
Binh Dang
Minh Le Nguyen
Kenji Satoh
AILaw
27
10
0
25 Jun 2021
Domain-Specific Pretraining for Vertical Search: Case Study on Biomedical Literature
Yu-Chiang Frank Wang
Jinchao Li
Tristan Naumann
Chenyan Xiong
Hao Cheng
...
Yang Qin
Eric Horvitz
Paul N. Bennett
Jianfeng Gao
Hoifung Poon
OOD
33
13
0
25 Jun 2021
Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models
Robert L Logan IV
Ivana Balažević
Eric Wallace
Fabio Petroni
Sameer Singh
Sebastian Riedel
VPVLM
39
207
0
24 Jun 2021
VOLO: Vision Outlooker for Visual Recognition
Li-xin Yuan
Qibin Hou
Zihang Jiang
Jiashi Feng
Shuicheng Yan
ViT
52
314
0
24 Jun 2021
Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting
Haixu Wu
Jiehui Xu
Jianmin Wang
Mingsheng Long
AI4TS
53
2,098
0
24 Jun 2021
DocFormer: End-to-End Transformer for Document Understanding
Srikar Appalaraju
Bhavan A. Jasani
Bhargava Urala Kota
Yusheng Xie
R. Manmatha
ViT
41
270
0
22 Jun 2021
Do Language Models Perform Generalizable Commonsense Inference?
Peifeng Wang
Filip Ilievski
Muhao Chen
Xiang Ren
ReLM
LRM
25
19
0
22 Jun 2021
GAIA: A Transfer Learning System of Object Detection that Fits Your Needs
Xingyuan Bu
Junran Peng
Junjie Yan
Tieniu Tan
Zhaoxiang Zhang
ObjD
VLM
31
53
0
21 Jun 2021
Secure Distributed Training at Scale
Eduard A. Gorbunov
Alexander Borzunov
Michael Diskin
Max Ryabinin
FedML
26
15
0
21 Jun 2021
Software-Based Dialogue Systems: Survey, Taxonomy and Challenges
Quim Motger
Xavier Franch
Jordi Marco
26
40
0
21 Jun 2021
Zero-shot learning approach to adaptive Cybersecurity using Explainable AI
Dattaraj J. Rao
Shraddha Mane
AAML
26
11
0
21 Jun 2021
Adversarial Examples Make Strong Poisons
Liam H. Fowl
Micah Goldblum
Ping-yeh Chiang
Jonas Geiping
Wojtek Czaja
Tom Goldstein
SILM
32
132
0
21 Jun 2021
CPM-2: Large-scale Cost-effective Pre-trained Language Models
Zhengyan Zhang
Yuxian Gu
Xu Han
Shengqi Chen
Chaojun Xiao
...
Minlie Huang
Wentao Han
Yang Liu
Xiaoyan Zhu
Maosong Sun
MoE
37
86
0
20 Jun 2021
Sparse Training via Boosting Pruning Plasticity with Neuroregeneration
Shiwei Liu
Tianlong Chen
Xiaohan Chen
Zahra Atashgahi
Lu Yin
Huanyu Kou
Li Shen
Mykola Pechenizkiy
Zhangyang Wang
Decebal Constantin Mocanu
40
111
0
19 Jun 2021
Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets
Irene Solaiman
Christy Dennison
30
222
0
18 Jun 2021
Label prompt for multi-label text classification
Rui Song
Xingbing Chen
Zelong Liu
Haining An
Zhiqi Zhang
Xiaoguang Wang
Hao Xu
VLM
28
4
0
18 Jun 2021
Efficient Self-supervised Vision Transformers for Representation Learning
Chunyuan Li
Jianwei Yang
Pengchuan Zhang
Mei Gao
Bin Xiao
Xiyang Dai
Lu Yuan
Jianfeng Gao
ViT
37
209
0
17 Jun 2021
Localized Uncertainty Attacks
Ousmane Amadou Dia
Theofanis Karaletsos
C. Hazirbas
Cristian Canton Ferrer
I. Kabul
E. Meijer
AAML
24
2
0
17 Jun 2021
A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness
James Diffenderfer
Brian Bartoldson
Shreya Chaganti
Jize Zhang
B. Kailkhura
OOD
31
69
0
16 Jun 2021
Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better
Gaurav Menghani
VLM
MedIm
23
366
0
16 Jun 2021
Question Answering Infused Pre-training of General-Purpose Contextualized Representations
Robin Jia
M. Lewis
Luke Zettlemoyer
18
28
0
15 Jun 2021
An Analytical Theory of Curriculum Learning in Teacher-Student Networks
Luca Saglietti
Stefano Sarao Mannelli
Andrew M. Saxe
27
25
0
15 Jun 2021
Simple GNN Regularisation for 3D Molecular Property Prediction & Beyond
Jonathan Godwin
Michael Schaarschmidt
Alex Gaunt
Alvaro Sanchez-Gonzalez
Yulia Rubanova
Petar Veličković
J. Kirkpatrick
Peter W. Battaglia
41
60
0
15 Jun 2021
Divergence Frontiers for Generative Models: Sample Complexity, Quantization Effects, and Frontier Integrals
Lang Liu
Krishna Pillutla
Sean Welleck
Sewoong Oh
Yejin Choi
Zaïd Harchaoui
MQ
28
14
0
15 Jun 2021