Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL
arXiv: 2005.14165

Papers citing "Language Models are Few-Shot Learners"

50 / 12,362 papers shown
Title
Multi-step Planning for Automated Hyperparameter Optimization with OptFormer
Lucio Dery
A. Friesen
Nando de Freitas
Marc'Aurelio Ranzato
Yutian Chen
81
1
0
10 Oct 2022
CORE: A Retrieve-then-Edit Framework for Counterfactual Data Generation
Tanay Dixit
Bhargavi Paranjape
Hannaneh Hajishirzi
Luke Zettlemoyer
SyDa
197
26
0
10 Oct 2022
Visual Prompt Tuning for Test-time Domain Adaptation
Yunhe Gao
Xingjian Shi
Yi Zhu
Hongya Wang
Zhiqiang Tang
Xiong Zhou
Mu Li
Dimitris N. Metaxas
VPVLM, VLM
176
89
0
10 Oct 2022
Quantifying Social Biases Using Templates is Unreliable
P. Seshadri
Pouya Pezeshkpour
Sameer Singh
89
34
0
09 Oct 2022
SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters
Shwai He
Liang Ding
Daize Dong
Miao Zhang
Dacheng Tao
MoE
135
91
0
09 Oct 2022
Noise-Robust De-Duplication at Scale
Emily Silcock
Luca D'Amico-Wong
Jinglin Yang
Melissa Dell
SyDa
85
20
0
09 Oct 2022
Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
H. H. Mao
114
22
0
09 Oct 2022
Understanding and Improving Zero-shot Multi-hop Reasoning in Generative Question Answering
Zhengbao Jiang
Jun Araki
Haibo Ding
Graham Neubig
LRM
76
11
0
09 Oct 2022
CHARD: Clinical Health-Aware Reasoning Across Dimensions for Text Generation Models
Steven Y. Feng
Vivek Khetan
Bogdan Sacaleanu
A. Gershman
Eduard H. Hovy
LRM
88
10
0
09 Oct 2022
Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT
B. Bhavya
Jinjun Xiong
Chengxiang Zhai
LRM
84
44
0
09 Oct 2022
Controllable Dialogue Simulation with In-Context Learning
Zekun Li
Wenhu Chen
Shiyang Li
Hong Wang
Jingu Qian
Xi Yan
215
47
0
09 Oct 2022
Advancing Model Pruning via Bi-level Optimization
Yihua Zhang
Yuguang Yao
Parikshit Ram
Pu Zhao
Tianlong Chen
Min-Fong Hong
Yanzhi Wang
Sijia Liu
152
68
0
08 Oct 2022
Understanding HTML with Large Language Models
Izzeddin Gur
Ofir Nachum
Yingjie Miao
Mustafa Safdari
Austin Huang
Aakanksha Chowdhery
Sharan Narang
Noah Fiedel
Aleksandra Faust
AI4CE
225
71
0
08 Oct 2022
Short Text Pre-training with Extended Token Classification for E-commerce Query Understanding
Haoming Jiang
Tianyu Cao
Zheng Li
Cheng-hsin Luo
Xianfeng Tang
Qingyu Yin
Danqing Zhang
R. Goutam
Bing Yin
RALM
69
12
0
08 Oct 2022
AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models
S. Kwon
Jeonghoon Kim
Jeongin Bae
Kang Min Yoo
Jin-Hwa Kim
Baeseong Park
Byeongwook Kim
Jung-Woo Ha
Nako Sung
Dongsoo Lee
MQ
117
31
0
08 Oct 2022
ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering
Zhiyu Zoey Chen
Shiyang Li
Charese Smiley
Zhiqiang Ma
Sameena Shah
William Yang Wang
AIMat, LRM, AI4CE
150
116
0
07 Oct 2022
An Analysis of the Effects of Decoding Algorithms on Fairness in Open-Ended Language Generation
Jwala Dhamala
Varun Kumar
Rahul Gupta
Kai-Wei Chang
Aram Galstyan
66
7
0
07 Oct 2022
LLMEffiChecker: Understanding and Testing Efficiency Degradation of Large Language Models
Simin Chen
Cong Liu
Mirazul Haque
Wei Yang
89
24
0
07 Oct 2022
Few-Shot Anaphora Resolution in Scientific Protocols via Mixtures of In-Context Experts
Nghia T. Le
Fan Bai
Alan Ritter
135
12
0
07 Oct 2022
Artificial Intelligence and Natural Language Processing and Understanding in Space: A Methodological Framework and Four ESA Case Studies
José Manuel Gómez-Pérez
Andrés García-Silva
R. Leone
M. Albani
Moritz Fontaine
C. Poncet
L. Summerer
A. Donati
Ilaria Roma
Stefano Scaglioni
63
1
0
07 Oct 2022
How Large Language Models are Transforming Machine-Paraphrased Plagiarism
Jan Philip Wahle
Terry Ruas
Frederic Kirstein
Bela Gipp
77
35
0
07 Oct 2022
Automatic Chain of Thought Prompting in Large Language Models
Zhuosheng Zhang
Aston Zhang
Mu Li
Alexander J. Smola
ReLM, LRM
180
639
0
07 Oct 2022
The Lifecycle of "Facts": A Survey of Social Bias in Knowledge Graphs
Angelie Kraft
Ricardo Usbeck
KELM
79
9
0
07 Oct 2022
Measuring and Narrowing the Compositionality Gap in Language Models
Ofir Press
Muru Zhang
Sewon Min
Ludwig Schmidt
Noah A. Smith
M. Lewis
ReLM, KELM, LRM
266
646
0
07 Oct 2022
A Unified Framework for Multi-intent Spoken Language Understanding with prompting
Feifan Song
Lianzhe Huang
Houfeng Wang
51
3
0
07 Oct 2022
Calibrating Factual Knowledge in Pretrained Language Models
Qingxiu Dong
Damai Dai
Yifan Song
Jingjing Xu
Zhifang Sui
Lei Li
KELM
311
90
0
07 Oct 2022
Scalable Self-Supervised Representation Learning from Spatiotemporal Motion Trajectories for Multimodal Computer Vision
Swetava Ganguli
C. V. K. Iyer
Vipul Pandey
SSL
90
5
0
07 Oct 2022
Achieving and Understanding Out-of-Distribution Generalization in Systematic Reasoning in Small-Scale Transformers
A. Nam
Mustafa Abdool
Trevor C. Maxfield
James L. McClelland
NAI, LRM, AI4CE
57
1
0
07 Oct 2022
Polyhistor: Parameter-Efficient Multi-Task Adaptation for Dense Vision Tasks
Yen-Cheng Liu
Chih-Yao Ma
Junjiao Tian
Zijian He
Z. Kira
160
52
0
07 Oct 2022
Improving Large-scale Paraphrase Acquisition and Generation
Yao Dou
Chao Jiang
Wei Xu
99
9
0
06 Oct 2022
Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models
David Wingate
Mohammad Shoeybi
Taylor Sorensen
89
77
0
06 Oct 2022
Real-World Robot Learning with Masked Visual Pre-training
Ilija Radosavovic
Tete Xiao
Stephen James
Pieter Abbeel
Jitendra Malik
Trevor Darrell
SSL
244
254
0
06 Oct 2022
VIMA: General Robot Manipulation with Multimodal Prompts
Yunfan Jiang
Agrim Gupta
Zichen Zhang
Guanzhi Wang
Yongqiang Dou
Yanjun Chen
Li Fei-Fei
Anima Anandkumar
Yuke Zhu
Linxi Fan
LM&Ro
117
355
0
06 Oct 2022
Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering
Jiacheng Liu
Skyler Hallinan
Ximing Lu
Pengfei He
Sean Welleck
Hannaneh Hajishirzi
Yejin Choi
RALM
99
60
0
06 Oct 2022
Toxicity in Multilingual Machine Translation at Scale
Marta R. Costa-jussà
Eric Michael Smith
C. Ropers
Daniel Licht
Jean Maillard
Javier Ferrando
Carlos Escolano
96
27
0
06 Oct 2022
Language Models are Multilingual Chain-of-Thought Reasoners
Freda Shi
Mirac Suzgun
Markus Freitag
Xuezhi Wang
Suraj Srivats
...
Yi Tay
Sebastian Ruder
Denny Zhou
Dipanjan Das
Jason W. Wei
ReLM, LRM
253
369
0
06 Oct 2022
ByteTransformer: A High-Performance Transformer Boosted for Variable-Length Inputs
Yujia Zhai
Chengquan Jiang
Leyuan Wang
Xiaoying Jia
Shang Zhang
Zizhong Chen
Xin Liu
Yibo Zhu
141
52
0
06 Oct 2022
State-of-the-art generalisation research in NLP: A taxonomy and review
Dieuwke Hupkes
Mario Giulianelli
Verna Dankers
Mikel Artetxe
Yanai Elazar
...
Leila Khalatbari
Maria Ryskina
Rita Frieske
Ryan Cotterell
Zhijing Jin
270
99
0
06 Oct 2022
Efficiently Enhancing Zero-Shot Performance of Instruction Following Model via Retrieval of Soft Prompt
Seonghyeon Ye
Joel Jang
Doyoung Kim
Yongrae Jo
Minjoon Seo
VLM
90
2
0
06 Oct 2022
SynBench: Task-Agnostic Benchmarking of Pretrained Representations using Synthetic Data
Ching-Yun Ko
Pin-Yu Chen
Jeet Mohapatra
Payel Das
Lucani E. Daniel
111
3
0
06 Oct 2022
Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
Seonghyeon Ye
Doyoung Kim
Joel Jang
Joongbo Shin
Minjoon Seo
FedML, VLM, UQCV, LRM
113
25
0
06 Oct 2022
Improving the Sample Efficiency of Prompt Tuning with Domain Adaptation
Xu Guo
Boyang Albert Li
Han Yu
VLM
121
24
0
06 Oct 2022
BootAug: Boosting Text Augmentation via Hybrid Instance Filtering Framework
Heng Yang
Ke Li
99
6
0
06 Oct 2022
Grape: Knowledge Graph Enhanced Passage Reader for Open-domain Question Answering
Mingxuan Ju
Wenhao Yu
Tong Zhao
Chuxu Zhang
Yanfang Ye
120
24
0
06 Oct 2022
MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text
Wenhu Chen
Hexiang Hu
Xi Chen
Pat Verga
William W. Cohen
RALM
100
160
0
06 Oct 2022
A Distributional Lens for Multi-Aspect Controllable Text Generation
Yuxuan Gu
Xiaocheng Feng
Sicheng Ma
Lingyuan Zhang
Heng Gong
Bing Qin
176
37
0
06 Oct 2022
Binding Language Models in Symbolic Languages
Zhoujun Cheng
Tianbao Xie
Peng Shi
Chengzu Li
Rahul Nadkarni
...
Dragomir R. Radev
Mari Ostendorf
Luke Zettlemoyer
Noah A. Smith
Tao Yu
LMTD
232
215
0
06 Oct 2022
XDoc: Unified Pre-training for Cross-Format Document Understanding
Jingye Chen
Tengchao Lv
Lei Cui
Changrong Zhang
Furu Wei
95
14
0
06 Oct 2022
Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering
Shamane Siriwardhana
Rivindu Weerasekera
Elliott Wen
Tharindu Kaluarachchi
R. Rana
Suranga Nanayakkara
VLM
84
187
0
06 Oct 2022
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao
Jeffrey Zhao
Dian Yu
Nan Du
Izhak Shafran
Karthik Narasimhan
Yuan Cao
LLMAG, ReLM, LRM
473
2,998
0
06 Oct 2022