ResearchTrend.AI

Language Models are Few-Shot Learners (arXiv 2005.14165)

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
Tom Henighan
Rewon Child
Aditya Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
Benjamin Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 11,497 papers shown
Automated Identification of Toxic Code Reviews Using ToxiCR
Jaydeb Sarker
Asif Kamal Turzo
Mingyou Dong
Amiangshu Bosu
27
31
0
26 Feb 2022
Combining Observational and Randomized Data for Estimating Heterogeneous Treatment Effects
Tobias Hatt
Jeroen Berrevoets
Alicia Curth
Stefan Feuerriegel
M. Schaar
CML
52
29
0
25 Feb 2022
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
Sewon Min
Xinxi Lyu
Ari Holtzman
Mikel Artetxe
M. Lewis
Hannaneh Hajishirzi
Luke Zettlemoyer
LLMAG
LRM
76
1,409
0
25 Feb 2022
TrimBERT: Tailoring BERT for Trade-offs
S. N. Sridhar
Anthony Sarah
Sairam Sundaresan
MQ
29
4
0
24 Feb 2022
Probing BERT's priors with serial reproduction chains
Takateru Yamakoshi
Thomas Griffiths
Robert D. Hawkins
29
12
0
24 Feb 2022
Is Neuro-Symbolic AI Meeting its Promise in Natural Language Processing? A Structured Review
Kyle Hamilton
Aparna Nayak
Bojan Bozic
Luca Longo
NAI
31
57
0
24 Feb 2022
Transformers in Medical Image Analysis: A Review
Kelei He
Chen Gan
Zhuoyuan Li
I. Rekik
Zihao Yin
Wen Ji
Yang Gao
Qian Wang
Junfeng Zhang
Dinggang Shen
ViT
MedIm
28
255
0
24 Feb 2022
Pretraining without Wordpieces: Learning Over a Vocabulary of Millions of Words
Zhangyin Feng
Duyu Tang
Cong Zhou
Junwei Liao
Shuangzhi Wu
Xiaocheng Feng
Bing Qin
Yunbo Cao
Shuming Shi
VLM
28
9
0
24 Feb 2022
From Natural Language to Simulations: Applying GPT-3 Codex to Automate Simulation Modeling of Logistics Systems
I. Jackson
M. J. Sáenz
16
8
0
24 Feb 2022
NoisyTune: A Little Noise Can Help You Finetune Pretrained Language Models Better
Chuhan Wu
Fangzhao Wu
Tao Qi
Yongfeng Huang
Xing Xie
25
58
0
24 Feb 2022
Sky Computing: Accelerating Geo-distributed Computing in Federated Learning
Jie Zhu
Shenggui Li
Yang You
FedML
16
5
0
24 Feb 2022
Using natural language prompts for machine translation
Xavier Garcia
Orhan Firat
AI4CE
30
30
0
23 Feb 2022
COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics
Lianhui Qin
Sean Welleck
Daniel Khashabi
Yejin Choi
AI4CE
58
144
0
23 Feb 2022
Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt
Lianzhe Huang
Shuming Ma
Dongdong Zhang
Furu Wei
Houfeng Wang
VLM
LRM
26
32
0
23 Feb 2022
Prompt-Learning for Short Text Classification
Yi Zhu
Xinke Zhou
Jipeng Qiang
Yun Li
Yunhao Yuan
Xindong Wu
VLM
18
34
0
23 Feb 2022
Memory Planning for Deep Neural Networks
Maksim Levental
33
4
0
23 Feb 2022
Blockchain Framework for Artificial Intelligence Computation
Jie You
6
8
0
23 Feb 2022
Evaluating Feature Attribution Methods in the Image Domain
Arne Gevaert
Axel-Jan Rousseau
Thijs Becker
D. Valkenborg
T. D. Bie
Yvan Saeys
FAtt
27
22
0
22 Feb 2022
Generating Videos with Dynamics-aware Implicit Generative Adversarial Networks
Sihyun Yu
Jihoon Tack
Sangwoo Mo
Hyunsu Kim
Junho Kim
Jung-Woo Ha
Jinwoo Shin
DiffM
VGen
42
199
0
21 Feb 2022
Transformer Quality in Linear Time
Weizhe Hua
Zihang Dai
Hanxiao Liu
Quoc V. Le
81
221
0
21 Feb 2022
Items from Psychometric Tests as Training Data for Personality Profiling Models of Twitter Users
Anne Kreuter
Kai Sassenberg
Roman Klinger
21
6
0
21 Feb 2022
Ligandformer: A Graph Neural Network for Predicting Compound Property with Robust Interpretation
Jinjiang Guo
Qi Liu
Han Guo
Xi Lu
AI4CE
24
3
0
21 Feb 2022
GPT-based Open-Ended Knowledge Tracing
Naiming Liu
Zichao Wang
Richard G. Baraniuk
Andrew S. Lan
AI4Ed
32
3
0
21 Feb 2022
Deconstructing Distributions: A Pointwise Framework of Learning
Gal Kaplun
Nikhil Ghosh
Saurabh Garg
Boaz Barak
Preetum Nakkiran
OOD
33
21
0
20 Feb 2022
$\mathcal{Y}$-Tuning: An Efficient Tuning Paradigm for Large-Scale Pre-Trained Models via Label Representation Learning
Yitao Liu
Chen An
Xipeng Qiu
29
17
0
20 Feb 2022
Visual Attention Network
Meng-Hao Guo
Chengrou Lu
Zheng-Ning Liu
Ming-Ming Cheng
Shiyong Hu
ViT
VLM
24
638
0
20 Feb 2022
COMPASS: Contrastive Multimodal Pretraining for Autonomous Systems
Shuang Ma
Sai H. Vemprala
Wenshan Wang
Jayesh K. Gupta
Yale Song
Daniel J. McDuff
Ashish Kapoor
SSL
37
9
0
20 Feb 2022
Do Transformers know symbolic rules, and would we know if they did?
Tommi Gröndahl
Yu-Wen Guo
Nirmal Asokan
30
0
0
19 Feb 2022
TransDreamer: Reinforcement Learning with Transformer World Models
Changgu Chen
Yi-Fu Wu
Jaesik Yoon
Sungjin Ahn
OffRL
32
91
0
19 Feb 2022
Synthetic Disinformation Attacks on Automated Fact Verification Systems
Y. Du
Antoine Bosselut
Christopher D. Manning
AAML
OffRL
36
32
0
18 Feb 2022
Mixture-of-Experts with Expert Choice Routing
Yan-Quan Zhou
Tao Lei
Han-Chu Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
160
331
0
18 Feb 2022
A Survey of Vision-Language Pre-Trained Models
Yifan Du
Zikang Liu
Junyi Li
Wayne Xin Zhao
VLM
42
180
0
18 Feb 2022
ST-MoE: Designing Stable and Transferable Sparse Expert Models
Barret Zoph
Irwan Bello
Sameer Kumar
Nan Du
Yanping Huang
J. Dean
Noam M. Shazeer
W. Fedus
MoE
24
183
0
17 Feb 2022
cosFormer: Rethinking Softmax in Attention
Zhen Qin
Weixuan Sun
Huicai Deng
Dongxu Li
Yunshen Wei
Baohong Lv
Junjie Yan
Lingpeng Kong
Yiran Zhong
38
212
0
17 Feb 2022
Revisiting Over-smoothing in BERT from the Perspective of Graph
Han Shi
Jiahui Gao
Hang Xu
Xiaodan Liang
Zhenguo Li
Lingpeng Kong
Stephen M. S. Lee
James T. Kwok
24
71
0
17 Feb 2022
Open-Ended Reinforcement Learning with Neural Reward Functions
Robert Meier
Asier Mujika
37
7
0
16 Feb 2022
Revisiting Parameter-Efficient Tuning: Are We Really There Yet?
Guanzheng Chen
Fangyu Liu
Zaiqiao Meng
Shangsong Liang
26
88
0
16 Feb 2022
EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation
Tao Ge
Si-Qing Chen
Furu Wei
MoE
32
20
0
16 Feb 2022
ZeroGen: Efficient Zero-shot Learning via Dataset Generation
Jiacheng Ye
Jiahui Gao
Qintong Li
Hang Xu
Jiangtao Feng
Zhiyong Wu
Tao Yu
Lingpeng Kong
SyDa
47
212
0
16 Feb 2022
General-purpose, long-context autoregressive modeling with Perceiver AR
Curtis Hawthorne
Andrew Jaegle
Cătălina Cangea
Sebastian Borgeaud
C. Nash
...
Hannah R. Sheahan
Neil Zeghidour
Jean-Baptiste Alayrac
João Carreira
Jesse Engel
43
65
0
15 Feb 2022
Better Together? An Evaluation of AI-Supported Code Translation
Justin D. Weisz
Michael J. Muller
Steven I. Ross
Fernando Martinez
Stephanie Houde
Mayank Agarwal
Kartik Talamadupula
John T. Richards
29
67
0
15 Feb 2022
P4E: Few-Shot Event Detection as Prompt-Guided Identification and Localization
Sha Li
Liyuan Liu
Yiqing Xie
Heng Ji
Jiawei Han
41
4
0
15 Feb 2022
Impact of Pretraining Term Frequencies on Few-Shot Reasoning
Yasaman Razeghi
Robert L Logan IV
Matt Gardner
Sameer Singh
ReLM
LRM
32
150
0
15 Feb 2022
Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark
Jiaxi Gu
Xiaojun Meng
Guansong Lu
Lu Hou
Minzhe Niu
...
Runhu Huang
Wei Zhang
Xingda Jiang
Chunjing Xu
Hang Xu
VLM
43
88
0
14 Feb 2022
Deduplicating Training Data Mitigates Privacy Risks in Language Models
Nikhil Kandpal
Eric Wallace
Colin Raffel
PILM
MU
54
275
0
14 Feb 2022
Transformer-based Approaches for Legal Text Processing
Nguyen Ha Thanh
Phuong Minh Nguyen
Thi-Hai-Yen Vuong
Minh Q. Bui
Minh-Chau Nguyen
Binh Dang
Vu Tran
Le-Minh Nguyen
Kenji Satoh
AILaw
24
12
0
13 Feb 2022
Scaling Laws Under the Microscope: Predicting Transformer Performance from Small Scale Experiments
Maor Ivgi
Y. Carmon
Jonathan Berant
19
17
0
13 Feb 2022
Flowformer: Linearizing Transformers with Conservation Flows
Haixu Wu
Jialong Wu
Jiehui Xu
Jianmin Wang
Mingsheng Long
14
90
0
13 Feb 2022
Semantic-Oriented Unlabeled Priming for Large-Scale Language Models
Yanchen Liu
Timo Schick
Hinrich Schütze
VLM
33
15
0
12 Feb 2022
Maximizing Communication Efficiency for Large-scale Training via 0/1 Adam
Yucheng Lu
Conglong Li
Minjia Zhang
Christopher De Sa
Yuxiong He
OffRL
AI4CE
26
20
0
12 Feb 2022