Language Models are Few-Shot Learners


28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL
arXiv: 2005.14165

Papers citing "Language Models are Few-Shot Learners"

50 / 11,640 papers shown
Semantically Informed Slang Interpretation
Zhewei Sun
R. Zemel
Yang Xu
28
7
0
02 May 2022
Can Information Behaviour Inform Machine Learning?
M. Ridley
AI4CE
29
0
0
01 May 2022
EasyNLP: A Comprehensive and Easy-to-use Toolkit for Natural Language Processing
Chengyu Wang
Minghui Qiu
Chen Shi
Taolin Zhang
Tingting Liu
Lei Li
Rongxiang Weng
Ming Wang
Jun Huang
W. Lin
27
21
0
30 Apr 2022
HDGT: Heterogeneous Driving Graph Transformer for Multi-Agent Trajectory Prediction via Scene Encoding
Xiaosong Jia
Peng Wu
Li Chen
Yunxing Liu
Hongyang Li
Junchi Yan
32
122
0
30 Apr 2022
Building a Role Specified Open-Domain Dialogue System Leveraging Large-Scale Language Models
Sanghwan Bae
Donghyun Kwak
Sungdong Kim
Dong-hyun Ham
Soyoung Kang
Sang-Woo Lee
W. Park
ALM
30
37
0
30 Apr 2022
To Know by the Company Words Keep and What Else Lies in the Vicinity
Jake Williams
H. Heidenreich
24
0
0
30 Apr 2022
MiCS: Near-linear Scaling for Training Gigantic Model on Public Cloud
Zhen Zhang
Shuai Zheng
Yida Wang
Justin Chiu
George Karypis
Trishul Chilimbi
Mu Li
Xin Jin
28
39
0
30 Apr 2022
Prompt Consistency for Zero-Shot Task Generalization
Chunting Zhou
Junxian He
Xuezhe Ma
Taylor Berg-Kirkpatrick
Graham Neubig
VLM
26
74
0
29 Apr 2022
Handling and Presenting Harmful Text in NLP Research
Hannah Rose Kirk
Abeba Birhane
Bertie Vidgen
Leon Derczynski
26
47
0
29 Apr 2022
TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models
Joel Jang
Seonghyeon Ye
Changho Lee
Sohee Yang
Joongbo Shin
Janghoon Han
Gyeonghun Kim
Minjoon Seo
CLL
KELM
27
93
0
29 Apr 2022
Flamingo: a Visual Language Model for Few-Shot Learning
Jean-Baptiste Alayrac
Jeff Donahue
Pauline Luc
Antoine Miech
Iain Barr
...
Mikolaj Binkowski
Ricardo Barreira
Oriol Vinyals
Andrew Zisserman
Karen Simonyan
MLLM
VLM
51
3,369
0
29 Apr 2022
Training Language Models with Language Feedback
Jérémy Scheurer
Jon Ander Campos
Jun Shern Chan
Angelica Chen
Kyunghyun Cho
Ethan Perez
ALM
53
48
0
29 Apr 2022
On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model
Seongjin Shin
Sang-Woo Lee
Hwijeen Ahn
Sungdong Kim
Hyoungseok Kim
...
Kyunghyun Cho
Gichang Lee
W. Park
Jung-Woo Ha
Nako Sung
LRM
38
94
0
28 Apr 2022
Taylor Genetic Programming for Symbolic Regression
Baihe He
Qiang Lu
Qingyun Yang
Jake Luo
Zhiguang Wang
37
29
0
28 Apr 2022
Towards Flexible Inference in Sequential Decision Problems via Bidirectional Transformers
Micah Carroll
Jessy Lin
Orr Paradise
Raluca Georgescu
Mingfei Sun
...
Stephanie Milani
Katja Hofmann
Matthew J. Hausknecht
Anca Dragan
Sam Devlin
OffRL
40
10
0
28 Apr 2022
Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
Derya Soydaner
3DV
49
150
0
27 Apr 2022
Towards Teachable Reasoning Systems: Using a Dynamic Memory of User Feedback for Continual System Improvement
Bhavana Dalvi
Oyvind Tafjord
Peter Clark
LRM
KELM
ReLM
38
37
0
27 Apr 2022
Can deep learning match the efficiency of human visual long-term memory in storing object details?
Emin Orhan
VLM
OCL
28
0
0
27 Apr 2022
DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers
Xianing Chen
Qiong Cao
Yujie Zhong
Jing Zhang
Shenghua Gao
Dacheng Tao
ViT
40
76
0
27 Apr 2022
An End-to-End Dialogue Summarization System for Sales Calls
Abedelkadir Asi
Song Wang
Roy Eisenstadt
Dean Geckt
Yarin Kuper
Yi Mao
Royi Ronen
30
16
0
27 Apr 2022
Plug-and-Play Adaptation for Continuously-updated QA
Kyungjae Lee
Wookje Han
Seung-won Hwang
Hwaran Lee
Joonsuk Park
Sang-Woo Lee
KELM
30
16
0
27 Apr 2022
A Thorough Examination on Zero-shot Dense Retrieval
Ruiyang Ren
Yingqi Qu
Qingbin Liu
Wayne Xin Zhao
Qifei Wu
Yuchen Ding
Hua Wu
Haifeng Wang
Ji-Rong Wen
39
41
0
27 Apr 2022
On the Limitations of Dataset Balancing: The Lost Battle Against Spurious Correlations
Roy Schwartz
Gabriel Stanovsky
42
26
0
27 Apr 2022
Testing the Ability of Language Models to Interpret Figurative Language
Emmy Liu
Chenxuan Cui
Kenneth Zheng
Graham Neubig
ELM
LRM
25
65
0
26 Apr 2022
Landing AI on Networks: An equipment vendor viewpoint on Autonomous Driving Networks
Dario Rossi
Liang Zhang
36
13
0
26 Apr 2022
Systematicity, Compositionality and Transitivity of Deep NLP Models: a Metamorphic Testing Perspective
Edoardo Manino
Julia Rozanova
Danilo S. Carvalho
André Freitas
Lucas C. Cordeiro
30
7
0
26 Apr 2022
Unsupervised Learning of Unbiased Visual Representations
C. Barbano
Enzo Tartaglione
Marco Grangetto
SSL
CML
OOD
39
1
0
26 Apr 2022
LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models
Mor Geva
Avi Caciularu
Guy Dar
Paul Roit
Shoval Sadde
Micah Shlain
Bar Tamir
Yoav Goldberg
KELM
35
27
0
26 Apr 2022
An Overview of Recent Work in Media Forensics: Methods and Threats
Kratika Bhagtani
A. Yadav
Emily R. Bartusiak
Ziyue Xiang
Ruiting Shao
Sriram Baireddy
Edward J. Delp
AAML
55
25
0
26 Apr 2022
Super-Prompting: Utilizing Model-Independent Contextual Data to Reduce Data Annotation Required in Visual Commonsense Tasks
Navid Rezaei
Marek Reformat
VLM
17
2
0
25 Apr 2022
Masked Image Modeling Advances 3D Medical Image Analysis
Zekai Chen
Devansh Agarwal
Kshitij Aggarwal
Wiem Safta
Samit Hirawat
V. Sethuraman
Mariann Micsinai Balan
Kevin Brown
33
69
0
25 Apr 2022
Natural Language to Code Translation with Execution
Freda Shi
Daniel Fried
Marjan Ghazvininejad
Luke Zettlemoyer
Sida I. Wang
40
124
0
25 Apr 2022
Can Foundation Models Perform Zero-Shot Task Specification For Robot Manipulation?
Yuchen Cui
S. Niekum
Abhi Gupta
Vikash Kumar
Aravind Rajeswaran
LM&Ro
30
74
0
23 Apr 2022
Data Distributional Properties Drive Emergent In-Context Learning in Transformers
Stephanie C. Y. Chan
Adam Santoro
Andrew Kyle Lampinen
Jane X. Wang
Aaditya K. Singh
Pierre Harvey Richemond
J. Mcclelland
Felix Hill
81
249
0
22 Apr 2022
Paramixer: Parameterizing Mixing Links in Sparse Factors Works Better than Dot-Product Self-Attention
Tong Yu
Ruslan Khalitov
Lei Cheng
Zhirong Yang
MoE
27
10
0
22 Apr 2022
Autoregressive Search Engines: Generating Substrings as Document Identifiers
Michele Bevilacqua
G. Ottaviano
Patrick Lewis
Wen-tau Yih
Sebastian Riedel
Fabio Petroni
KELM
RALM
42
156
0
22 Apr 2022
KALA: Knowledge-Augmented Language Model Adaptation
Minki Kang
Jinheon Baek
Sung Ju Hwang
VLM
KELM
36
34
0
22 Apr 2022
Zero and Few-shot Learning for Author Profiling
Mara Chinea-Rios
Thomas Müller
Gretel Liz De la Pena Sarracén
Francisco Rangel
Marc Franco-Salvador
31
14
0
22 Apr 2022
Multimodal Adaptive Distillation for Leveraging Unimodal Encoders for Vision-Language Tasks
Zhecan Wang
Noel Codella
Yen-Chun Chen
Luowei Zhou
Xiyang Dai
...
Jianwei Yang
Haoxuan You
Kai-Wei Chang
Shih-Fu Chang
Lu Yuan
VLM
OffRL
31
22
0
22 Apr 2022
Decorate the Examples: A Simple Method of Prompt Design for Biomedical Relation Extraction
Hui-Syuan Yeh
Thomas Lavergne
Pierre Zweigenbaum
26
10
0
21 Apr 2022
Residual Mixture of Experts
Lemeng Wu
Mengchen Liu
Yinpeng Chen
Dongdong Chen
Xiyang Dai
Lu Yuan
MoE
27
36
0
20 Apr 2022
You Are What You Write: Preserving Privacy in the Era of Large Language Models
Richard Plant
V. Giuffrida
Dimitra Gkatzia
PILM
40
19
0
20 Apr 2022
CodexDB: Generating Code for Processing SQL Queries using GPT-3 Codex
Immanuel Trummer
LMTD
29
19
0
19 Apr 2022
ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models
Chunyuan Li
Haotian Liu
Liunian Harold Li
Pengchuan Zhang
J. Aneja
...
Ping Jin
Houdong Hu
Zicheng Liu
Yong Jae Lee
Jianfeng Gao
50
145
0
19 Apr 2022
Where Was COVID-19 First Discovered? Designing a Question-Answering System for Pandemic Situations
Johannes Graf
G. Lancho
Patrick Zschech
Kai Heinrich
27
3
0
19 Apr 2022
UMass PCL at SemEval-2022 Task 4: Pre-trained Language Model Ensembles for Detecting Patronizing and Condescending Language
David Koleczek
Alexander Scarlatos
Siddha Makarand Karkare
Preshma Linet Pereira
29
0
0
18 Apr 2022
Empirical Evaluation and Theoretical Analysis for Representation Learning: A Survey
Kento Nozawa
Issei Sato
AI4TS
29
4
0
18 Apr 2022
Simultaneous Multiple-Prompt Guided Generation Using Differentiable Optimal Transport
Yingtao Tian
Marco Cuturi
David R Ha
DiffM
OT
46
1
0
18 Apr 2022
Unsupervised Cross-Task Generalization via Retrieval Augmentation
Bill Yuchen Lin
Kangmin Tan
Chris Miller
Beiwen Tian
Xiang Ren
LRM
RALM
32
48
0
17 Apr 2022
On the Origin of Hallucinations in Conversational Models: Is it the Datasets or the Models?
Nouha Dziri
Sivan Milton
Mo Yu
Osmar Zaiane
Siva Reddy
HILM
19
188
0
17 Apr 2022