Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 12,431 papers shown
Tokenization Consistency Matters for Generative Models on Extractive NLP Tasks
Kaiser Sun
Peng Qi
Yuhao Zhang
Lan Liu
William Yang Wang
Zhiheng Huang
80
9
0
19 Dec 2022
Inducing Character-level Structure in Subword-based Language Models with Type-level Interchange Intervention Training
Jing-ling Huang
Zhengxuan Wu
Kyle Mahowald
Christopher Potts
83
14
0
19 Dec 2022
Improved Long-Form Spoken Language Translation with Large Language Models
Arya D. McCarthy
Haotong Zhang
Shankar Kumar
Felix Stahlberg
Axel H. Ng
71
2
0
19 Dec 2022
A Comparative Study on Textual Saliency of Styles from Eye Tracking, Annotations, and Language Models
Karin de Langis
Dongyeop Kang
103
1
0
19 Dec 2022
Z-ICL: Zero-Shot In-Context Learning with Pseudo-Demonstrations
Xinxi Lyu
Sewon Min
Iz Beltagy
Luke Zettlemoyer
Hannaneh Hajishirzi
VLM
75
68
0
19 Dec 2022
Synthetic Pre-Training Tasks for Neural Machine Translation
Zexue He
Graeme W. Blackwood
Yikang Shen
Julian McAuley
Rogerio Feris
54
4
0
19 Dec 2022
Training Trajectories of Language Models Across Scales
Mengzhou Xia
Mikel Artetxe
Chunting Zhou
Xi Lin
Ramakanth Pasunuru
Danqi Chen
Luke Zettlemoyer
Ves Stoyanov
AIFin, LRM
98
64
0
19 Dec 2022
Scalable Diffusion Models with Transformers
William S. Peebles
Saining Xie
GNN
175
2,440
0
19 Dec 2022
Evaluating Human-Language Model Interaction
Mina Lee
Megha Srivastava
Amelia Hardy
John Thickstun
Esin Durmus
...
Hancheng Cao
Tony Lee
Rishi Bommasani
Michael S. Bernstein
Percy Liang
LM&MA, ALM
108
102
0
19 Dec 2022
DSI++: Updating Transformer Memory with New Documents
Sanket Vaibhav Mehta
Jai Gupta
Yi Tay
Mostafa Dehghani
Vinh Q. Tran
J. Rao
Marc Najork
Emma Strubell
Donald Metzler
CLL
103
46
0
19 Dec 2022
Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments
Yu Gu
Xiang Deng
Yu-Chuan Su
LLMAG
123
58
0
19 Dec 2022
A Retrieve-and-Read Framework for Knowledge Graph Link Prediction
Vardaan Pahuja
Boshi Wang
Hugo Latapie
Jayanth Srinivasa
Yu-Chuan Su
76
13
0
19 Dec 2022
On Event Individuation for Document-Level Information Extraction
William Gantt
Reno Kriz
Yunmo Chen
Siddharth Vashishtha
Aaron Steven White
69
2
0
19 Dec 2022
Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor
Or Honovich
Thomas Scialom
Omer Levy
Timo Schick
ALM
167
374
0
19 Dec 2022
Multilingual Sequence-to-Sequence Models for Hebrew NLP
Matan Eyal
Hila Noga
Roee Aharoni
Idan Szpektor
Reut Tsarfaty
47
4
0
19 Dec 2022
StyleFlow: Disentangle Latent Representations via Normalizing Flow for Unsupervised Text Style Transfer
Kangchen Zhu
Zhiliang Tian
Ruifeng Luo
Xiaoguang Mao
OOD
105
3
0
19 Dec 2022
Visconde: Multi-document QA with GPT-3 and Neural Reranking
Jayr Pereira
R. Fidalgo
R. Lotufo
Rodrigo Nogueira
BDL, RALM
78
33
0
19 Dec 2022
Cross-Lingual Retrieval Augmented Prompt for Low-Resource Languages
Ercong Nie
Sheng Liang
Helmut Schmid
Hinrich Schütze
VLM, RALM, LRM
114
22
0
19 Dec 2022
Optimizing Prompts for Text-to-Image Generation
Y. Hao
Zewen Chi
Li Dong
Furu Wei
125
152
0
19 Dec 2022
Explanation Regeneration via Information Bottleneck
Qintong Li
Zhiyong Wu
Lingpeng Kong
Wei Bi
93
4
0
19 Dec 2022
Reasoning with Language Model Prompting: A Survey
Shuofei Qiao
Yixin Ou
Ningyu Zhang
Xiang Chen
Yunzhi Yao
Shumin Deng
Chuanqi Tan
Fei Huang
Huajun Chen
ReLM, ELM, LRM
232
327
0
19 Dec 2022
Latent Diffusion for Language Generation
Justin Lovelace
Varsha Kishore
Chao-gang Wan
Eliot Shekhtman
Kilian Q. Weinberger
DiffM
132
82
0
19 Dec 2022
Medical Knowledge Graph QA for Drug-Drug Interaction Prediction based on Multi-hop Machine Reading Comprehension
Peng Gao
Feng Gao
Jiancheng Ni
Yu Wang
Fei Wang
62
3
0
19 Dec 2022
AI Art in Architecture
J. Ploennigs
Markus Berger
DiffM
81
72
0
19 Dec 2022
Review of security techniques for memristor computing systems
Minhui Zou
Nan Du
Shahar Kvatinsky
AAML
26
7
0
19 Dec 2022
E-NER -- An Annotated Named Entity Recognition Corpus of Legal Text
Ting Wai Terence Au
Ingemar J. Cox
Vasileios Lampos
AILaw
72
28
0
19 Dec 2022
MIGA: A Unified Multi-task Generation Framework for Conversational Text-to-SQL
Yingwen Fu
Wenjie Ou
Zhou Yu
Yue Lin
75
7
0
19 Dec 2022
PromptBoosting: Black-Box Text Classification with Ten Forward Passes
Bairu Hou
J. O'Connor
Jacob Andreas
Shiyu Chang
Yang Zhang
VLM
57
44
0
19 Dec 2022
Discovering Language Model Behaviors with Model-Written Evaluations
Ethan Perez
Sam Ringer
Kamilė Lukošiūtė
Karina Nguyen
Edwin Chen
...
Danny Hernandez
Deep Ganguli
Evan Hubinger
Nicholas Schiefer
Jared Kaplan
ALM
102
407
0
19 Dec 2022
Natural Language to Code Generation in Interactive Data Science Notebooks
Pengcheng Yin
Wen-Ding Li
Kefan Xiao
Abhishek Rao
Yeming Wen
...
Paige Bailey
Michele Catasta
Henryk Michalewski
Oleksandr Polozov
Charles Sutton
85
66
0
19 Dec 2022
ColoristaNet for Photorealistic Video Style Transfer
Xiaowen Qiu
Ruize Xu
Boan He
Yingtao Zhang
Wenqiang Zhang
Weifeng Ge
59
0
0
19 Dec 2022
I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation
Chandra Bhagavatula
Jena D. Hwang
Doug Downey
Ronan Le Bras
Ximing Lu
Lianhui Qin
Keisuke Sakaguchi
Swabha Swayamdipta
Peter West
Yejin Choi
103
34
0
19 Dec 2022
Emergent Analogical Reasoning in Large Language Models
Taylor Webb
K. Holyoak
Hongjing Lu
ReLM, ELM, LRM, AI4CE
112
320
0
19 Dec 2022
Rainproof: An Umbrella To Shield Text Generators From Out-Of-Distribution Data
Maxime Darrin
Pablo Piantanida
Pierre Colombo
OODD
222
15
0
18 Dec 2022
Medical Diagnosis with Large Scale Multimodal Transformers: Leveraging Diverse Data for More Accurate Diagnosis
Firas Khader
Gustav Mueller-Franzes
Tian Wang
T. Han
Soroosh Tayebi Arasteh
...
Keno Bressem
Christiane Kuhl
S. Nebelung
Jakob Nikolas Kather
Daniel Truhn
32
6
0
18 Dec 2022
JEMMA: An Extensible Java Dataset for ML4Code Applications
Anjan Karmakar
Miltiadis Allamanis
Romain Robbes
VLM
55
3
0
18 Dec 2022
Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale
Hritik Bansal
Karthik Gopalakrishnan
Saket Dingliwal
S. Bodapati
Katrin Kirchhoff
Dan Roth
LRM
90
51
0
18 Dec 2022
On the Connection between Invariant Learning and Adversarial Training for Out-of-Distribution Generalization
Shiji Xin
Yifei Wang
Jingtong Su
Yisen Wang
OOD
92
7
0
18 Dec 2022
Synthesis and Evaluation of a Domain-specific Large Data Set for Dungeons & Dragons
Akila Peiris
Nisansa de Silva
54
5
0
18 Dec 2022
PoE: a Panel of Experts for Generalized Automatic Dialogue Assessment
Chen Zhang
L. F. D’Haro
Qiquan Zhang
Thomas Friedrichs
Haizhou Li
77
7
0
18 Dec 2022
Low-Resource Authorship Style Transfer: Can Non-Famous Authors Be Imitated?
Ajay Patel
Nicholas Andrews
Chris Callison-Burch
75
7
0
18 Dec 2022
Language model acceptability judgements are not always robust to context
Koustuv Sinha
Jon Gauthier
Aaron Mueller
Kanishka Misra
Keren Fuentes
R. Levy
Adina Williams
90
18
0
18 Dec 2022
Foundation models in brief: A historical, socio-technical focus
Johannes Schneider
VLM
71
9
0
17 Dec 2022
Controlling Styles in Neural Machine Translation with Activation Prompt
Yifan Wang
Zewei Sun
Shanbo Cheng
Weiguo Zheng
Mingxuan Wang
87
10
0
17 Dec 2022
Improving Cross-task Generalization of Unified Table-to-text Models with Compositional Task Configurations
Jifan Chen
Yuhao Zhang
Lan Liu
Rui Dong
Xinchi Chen
Patrick Ng
William Yang Wang
Zhiheng Huang
AI4CE
69
4
0
17 Dec 2022
Neural Story Planning
Anbang Ye
Christopher Cui
Taiwei Shi
Mark O. Riedl
59
8
0
16 Dec 2022
Rarely a problem? Language models exhibit inverse scaling in their predictions following few-type quantifiers
J. Michaelov
Benjamin Bergen
44
17
0
16 Dec 2022
Evaluating Step-by-Step Reasoning through Symbolic Verification
Yi-Fan Zhang
Hanlin Zhang
Li Erran Li
Eric P. Xing
ReLM, LRM
92
8
0
16 Dec 2022
Plansformer: Generating Symbolic Plans using Transformers
Vishal Pallagani
Bharath Muppasani
K. Murugesan
F. Rossi
L. Horesh
Biplav Srivastava
F. Fabiano
Andrea Loreggia
LM&Ro, LLMAG, OffRL
74
38
0
16 Dec 2022
Hippocampus-Inspired Cognitive Architecture (HICA) for Operant Conditioning
Deokgun Park
Md Ashaduzzaman Rubel Mondol
Sm Mazharul Islam
Aishwarya Pothula
45
0
0
16 Dec 2022