Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 12,373 papers shown
A Transformer Architecture for Online Gesture Recognition of Mathematical Expressions
Mirco Ramo
Guénolé Silvestre
45
1
0
04 Nov 2022
Continuous Prompt Tuning Based Textual Entailment Model for E-commerce Entity Typing
Yibo Wang
Congying Xia
Guan Wang
Philip Yu
55
6
0
04 Nov 2022
A General Purpose Neural Architecture for Geospatial Systems
Nasim Rahaman
Martin Weiss
Frederik Trauble
Francesco Locatello
Alexandre Lacoste
Yoshua Bengio
C. Pal
Li Erran Li
Bernhard Schölkopf
AI4TS, AI4CE
55
6
0
04 Nov 2022
Experiences from Using Code Explanations Generated by Large Language Models in a Web Software Development E-Book
Stephen MacNeil
Andrew Tran
Arto Hellas
Joanne Kim
Sami Sarsa
Paul Denny
Seth Bernstein
Juho Leinonen
107
190
0
04 Nov 2022
Unintended Memorization and Timing Attacks in Named Entity Recognition Models
Rana Salal Ali
Benjamin Zi Hao Zhao
Hassan Jameel Asghar
Tham Nguyen
Ian D. Wood
Dali Kaafar
AAML
56
3
0
04 Nov 2022
Federated Multilingual Models for Medical Transcript Analysis
Andre Manoel
Mirian Hipolito Garcia
Tal Baumel
Shize Su
Jialei Chen
Dan Miller
D. Karmon
Robert Sim
Dimitrios Dimitriadis
61
13
0
04 Nov 2022
Hardware/Software co-design with ADC-Less In-memory Computing Hardware for Spiking Neural Networks
M. Apolinario
Adarsh Kosta
Utkarsh Saxena
Kaushik Roy
57
7
0
03 Nov 2022
Time-aware Prompting for Text Generation
Shuyang Cao
Lu Wang
70
12
0
03 Nov 2022
Overcoming Barriers to Skill Injection in Language Modeling: Case Study in Arithmetic
Mandar Sharma
Nikhil Muralidhar
Naren Ramakrishnan
53
6
0
03 Nov 2022
LMentry: A Language Model Benchmark of Elementary Language Tasks
Avia Efrat
Or Honovich
Omer Levy
104
20
0
03 Nov 2022
Could Giant Pretrained Image Models Extract Universal Representations?
Yutong Lin
Ze Liu
Zheng Zhang
Han Hu
Nanning Zheng
Stephen Lin
Yue Cao
VLM
106
9
0
03 Nov 2022
Inverse scaling can become U-shaped
Jason W. Wei
Najoung Kim
Yi Tay
Quoc V. Le
LRM
108
64
0
03 Nov 2022
Pangu-Weather: A 3D High-Resolution Model for Fast and Accurate Global Weather Forecast
Kaifeng Bi
Lingxi Xie
Hengheng Zhang
Xin Chen
Xiaotao Gu
Qi Tian
AI4Cl
87
171
0
03 Nov 2022
Large Language Models Are Human-Level Prompt Engineers
Yongchao Zhou
Andrei Ioan Muresanu
Ziwen Han
Keiran Paster
Silviu Pitis
Harris Chan
Jimmy Ba
ALM, LLMAG
195
904
0
03 Nov 2022
Latent Prompt Tuning for Text Summarization
Yubo Zhang
Xingxing Zhang
Xun Wang
Si-Qing Chen
Furu Wei
VLM
95
12
0
03 Nov 2022
Iterative autoregression: a novel trick to improve your low-latency speech enhancement model
Pavel Andreev
Nicholas Babaev
Azat Saginbaev
Ivan Shchekotov
Aibek Alanov
81
5
0
03 Nov 2022
Using Large Pre-Trained Language Model to Assist FDA in Premarket Medical Device
Zongzhe Xu
LM&MA, MedIm
64
0
0
03 Nov 2022
PINTO: Faithful Language Reasoning Using Prompt-Generated Rationales
Peifeng Wang
Aaron Chan
Filip Ilievski
Muhao Chen
Xiang Ren
LRM, ReLM
117
65
0
03 Nov 2022
Generative Entity-to-Entity Stance Detection with Knowledge Graph Augmentation
Xinliang Frederick Zhang
Nick Beauchamp
Lu Wang
54
10
0
02 Nov 2022
MPCFormer: fast, performant and private Transformer inference with MPC
Dacheng Li
Rulin Shao
Hongyi Wang
Han Guo
Eric P. Xing
Haotong Zhang
92
87
0
02 Nov 2022
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Yogesh Balaji
Seungjun Nah
Xun Huang
Arash Vahdat
Jiaming Song
...
Timo Aila
S. Laine
Bryan Catanzaro
Tero Karras
Xuan Li
VLM, MoE
213
832
0
02 Nov 2022
Analysis of Noisy-target Training for DNN-based speech enhancement
Takuya Fujimura
Tomoki Toda
62
6
0
02 Nov 2022
Neural Systematic Binder
Gautam Singh
Yeongbin Kim
Sungjin Ahn
OCL
114
37
0
02 Nov 2022
Fine-grained Visual-Text Prompt-Driven Self-Training for Open-Vocabulary Object Detection
Yanxin Long
Jianhua Han
Runhu Huang
Xu Hang
Yi Zhu
Chunjing Xu
Xiaodan Liang
VLM, ObjD
104
19
0
02 Nov 2022
BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder
Yosuke Higuchi
Tetsuji Ogawa
Tetsunori Kobayashi
Shinji Watanabe
169
13
0
02 Nov 2022
Two-stage LLM Fine-tuning with Less Specialization and More Generalization
Yihan Wang
Si Si
Daliang Li
Michal Lukasik
Felix X. Yu
Cho-Jui Hsieh
Inderjit S Dhillon
Sanjiv Kumar
137
30
0
01 Nov 2022
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
320
563
0
01 Nov 2022
ClassActionPrediction: A Challenging Benchmark for Legal Judgment Prediction of Class Action Cases in the US
Gil Semo
Dor Bernsohn
Ben Hagag
Gila Hayat
Joel Niklaus
AILaw, ELM
97
20
0
01 Nov 2022
VarMAE: Pre-training of Variational Masked Autoencoder for Domain-adaptive Language Understanding
Dou Hu
Xiaolong Hou
Xiyang Du
Mengyuan Zhou
Lian-Xin Jiang
Yang Mo
Xiaofeng Shi
97
13
0
01 Nov 2022
A General Search-based Framework for Generating Textual Counterfactual Explanations
Daniel Gilo
Shaul Markovitch
LRM
90
0
0
01 Nov 2022
CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about Negation
Abhilasha Ravichander
Matt Gardner
Ana Marasović
110
35
0
01 Nov 2022
A Close Look into the Calibration of Pre-trained Language Models
Yangyi Chen
Lifan Yuan
Ganqu Cui
Zhiyuan Liu
Heng Ji
153
52
0
31 Oct 2022
Generating Sequences by Learning to Self-Correct
Sean Welleck
Ximing Lu
Peter West
Faeze Brahman
T. Shen
Daniel Khashabi
Yejin Choi
LRM
111
238
0
31 Oct 2022
Zero-Shot Text Classification with Self-Training
Ariel Gera
Alon Halfon
Eyal Shnarch
Yotam Perlitz
L. Ein-Dor
Noam Slonim
VLM
76
62
0
31 Oct 2022
AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning
Yaqing Wang
Sahaj Agarwal
Subhabrata Mukherjee
Xiaodong Liu
Jing Gao
Ahmed Hassan Awadallah
Jianfeng Gao
MoE
109
136
0
31 Oct 2022
Learning New Tasks from a Few Examples with Soft-Label Prototypes
Avyav Kumar Singh
Ekaterina Shutova
H. Yannakoudakis
VLM
89
0
0
31 Oct 2022
SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
Xiaochuang Han
Sachin Kumar
Yulia Tsvetkov
165
91
0
31 Oct 2022
A Simple, Yet Effective Approach to Finding Biases in Code Generation
Spyridon Mouselinos
Mateusz Malinowski
Henryk Michalewski
110
9
0
31 Oct 2022
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
Elias Frantar
Saleh Ashkboos
Torsten Hoefler
Dan Alistarh
MQ
194
1,013
0
31 Oct 2022
QNet: A Quantum-native Sequence Encoder Architecture
Wei-Yen Day
Hao-Sheng Chen
Min Sun
96
0
0
31 Oct 2022
Pneg: Prompt-based Negative Response Generation for Dialogue Response Selection Task
Nyoungwoo Lee
Yujin Baek
Ho-Jin Choi
Jaegul Choo
76
6
0
31 Oct 2022
When Language Model Meets Private Library
Daoguang Zan
Bei Chen
Zeqi Lin
Bei Guan
Yongji Wang
Jian-Guang Lou
ALM
134
74
0
31 Oct 2022
Automated Dominative Subspace Mining for Efficient Neural Architecture Search
Yaofo Chen
Yong Guo
Daihai Liao
Fanbing Lv
Hengjie Song
James Tin-Yau Kwok
Mingkui Tan
88
4
0
31 Oct 2022
Scoring Black-Box Models for Adversarial Robustness
Jian Vora
Pranay Reddy Samala
68
0
0
31 Oct 2022
QuaLA-MiniLM: a Quantized Length Adaptive MiniLM
Shira Guskin
Moshe Wasserblat
Chang Wang
Haihao Shen
MQ
76
2
0
31 Oct 2022
GPS: Genetic Prompt Search for Efficient Few-shot Learning
Hanwei Xu
Yujun Chen
Yulun Du
Nan Shao
Yanggang Wang
Haiyu Li
Zhilin Yang
VLM
63
31
0
31 Oct 2022
Poison Attack and Defense on Deep Source Code Processing Models
Jia Li
Zhuo Li
Huangzhao Zhang
Ge Li
Zhi Jin
Xing Hu
Xin Xia
AAML
69
19
0
31 Oct 2022
XMD: An End-to-End Framework for Interactive Explanation-Based Debugging of NLP Models
Dong-Ho Lee
Akshen Kadakia
Brihi Joshi
Aaron Chan
Ziyi Liu
...
Takashi Shibuya
Ryosuke Mitani
Toshiyuki Sekiya
Jay Pujara
Xiang Ren
LRM
79
9
0
30 Oct 2022
DiffusER: Discrete Diffusion via Edit-based Reconstruction
Machel Reid
Vincent J. Hellendoorn
Graham Neubig
DiffM
123
42
0
30 Oct 2022
Learning to Decompose: Hypothetical Question Decomposition Based on Comparable Texts
Ben Zhou
Kyle Richardson
Xiaodong Yu
Dan Roth
ReLM
101
22
0
30 Oct 2022