Language Models are Few-Shot Learners
28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 12,301 papers shown
Language Modelling with Pixels
Phillip Rust
Jonas F. Lotz
Emanuele Bugliarello
Elizabeth Salesky
Miryam de Lhoneux
Desmond Elliott
VLM
107
46
0
14 Jul 2022
Neural Data-to-Text Generation Based on Small Datasets: Comparing the Added Value of Two Semi-Supervised Learning Approaches on Top of a Large Language Model
Chris van der Lee
Thiago Castro Ferreira
Chris Emmery
Travis J. Wiltshire
Emiel Krahmer
78
2
0
14 Jul 2022
BERTIN: Efficient Pre-Training of a Spanish Language Model using Perplexity Sampling
Javier de la Rosa
E. G. Ponferrada
Paulo Villegas
Pablo González de Prado Salas
Manu Romero
María Grandury
74
95
0
14 Jul 2022
Masked Autoencoders that Listen
Po-Yao (Bernie) Huang
Hu Xu
Juncheng Billy Li
Alexei Baevski
Michael Auli
Wojciech Galuba
Florian Metze
Christoph Feichtenhofer
142
290
0
13 Jul 2022
N-Grammer: Augmenting Transformers with latent n-grams
Aurko Roy
Rohan Anil
Guangda Lai
Benjamin Lee
Jeffrey Zhao
...
Yu
Phuong Dao
Christopher Fifty
Zhiwen Chen
Yonghui Wu
77
8
0
13 Jul 2022
Re2G: Retrieve, Rerank, Generate
Michael R. Glass
Gaetano Rossiello
Md. Faisal Mahbub Chowdhury
Ankita Rajaram Naik
Pengshan Cai
A. Gliozzo
RALM
89
96
0
13 Jul 2022
Does GNN Pretraining Help Molecular Representation?
Ruoxi Sun
Hanjun Dai
Adams Wei Yu
SSL, AI4CE, GNN
65
75
0
13 Jul 2022
Wayformer: Motion Forecasting via Simple & Efficient Attention Networks
Nigamaa Nayakanti
Rami Al-Rfou
Aurick Zhou
Kratarth Goel
Khaled S. Refaat
Benjamin Sapp
AI4TS
135
259
0
12 Jul 2022
A new hope for network model generalization
Alexander Dietmüller
Siddhant Ray
Romain Jacob
Laurent Vanbever
AI4CE
95
39
0
12 Jul 2022
Vision Transformer for NeRF-Based View Synthesis from a Single Input Image
Kai-En Lin
Yen-Chen Lin
Wei-Sheng Lai
Nayeon Lee
Yichang Shih
R. Ramamoorthi
ViT
90
114
0
12 Jul 2022
Inner Monologue: Embodied Reasoning through Planning with Language Models
Wenlong Huang
F. Xia
Ted Xiao
Harris Chan
Jacky Liang
...
Tomas Jackson
Linda Luu
Sergey Levine
Karol Hausman
Brian Ichter
LLMAG, LM&Ro, LRM
184
926
0
12 Jul 2022
CP3: Unifying Point Cloud Completion by Pretrain-Prompt-Predict Paradigm
Mingye Xu
Yali Wang
Yihao Liu
Tong He
Yu Qiao
3DPC
86
17
0
12 Jul 2022
Bootstrapping a User-Centered Task-Oriented Dialogue System
Shijie Chen
Ziru Chen
Xiang Deng
A. Lewis
Lingbo Mo
...
Zhen Wang
Xiang Yue
Tianshu Zhang
Yu-Chuan Su
Huan Sun
50
2
0
11 Jul 2022
Towards Neural Numeric-To-Text Generation From Temporal Personal Health Data
Jon Harris
Mohammed J Zaki
AI4TS
60
2
0
11 Jul 2022
Exploring Length Generalization in Large Language Models
Cem Anil
Yuhuai Wu
Anders Andreassen
Aitor Lewkowycz
Vedant Misra
V. Ramasesh
Ambrose Slone
Guy Gur-Ari
Ethan Dyer
Behnam Neyshabur
ReLM, LRM
111
170
0
11 Jul 2022
A clinically motivated self-supervised approach for content-based image retrieval of CT liver images
Kristoffer Wickstrøm
Eirik Agnalt Ostmo
Keyur Radiya
Karl Øyvind Mikalsen
Michael C. Kampffmeyer
Robert Jenssen
SSL
92
16
0
11 Jul 2022
Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning
Homer Walke
Jonathan Yang
Albert Yu
Aviral Kumar
Jedrzej Orbik
Avi Singh
Sergey Levine
OffRL, OnRL
90
32
0
11 Jul 2022
TCR: A Transformer Based Deep Network for Predicting Cancer Drugs Response
Jie Gao
Jing Hu
Wan-Na Sun
Yili Shen
Xiaonan Zhang
Xiaomin Fang
Fan Wang
Guo-Guo Zhao
MedIm
41
2
0
10 Jul 2022
LM-Nav: Robotic Navigation with Large Pre-Trained Models of Language, Vision, and Action
Dhruv Shah
B. Osinski
Brian Ichter
Sergey Levine
LM&Ro
260
470
0
10 Jul 2022
Radiomics-Guided Global-Local Transformer for Weakly Supervised Pathology Localization in Chest X-Rays
Yan Han
G. Holste
Ying Ding
Ahmed H. Tewfik
Yifan Peng
Zhangyang Wang
LM&MA, ViT
121
16
0
10 Jul 2022
Few-shot training LLMs for project-specific code-summarization
Toufique Ahmed
Prem Devanbu
235
241
0
09 Jul 2022
SCouT: Synthetic Counterfactuals via Spatiotemporal Transformers for Actionable Healthcare
Bhishma Dedhia
Roshini Balasubramanian
N. Jha
MedIm
46
4
0
09 Jul 2022
Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling
Tung Nguyen
Aditya Grover
BDL, UQCV
87
107
0
09 Jul 2022
TalkToModel: Explaining Machine Learning Models with Interactive Natural Language Conversations
Dylan Slack
Satyapriya Krishna
Himabindu Lakkaraju
Sameer Singh
84
83
0
08 Jul 2022
Automatic Exploration of Textual Environments with Language-Conditioned Autotelic Agents
Laetitia Teodorescu
Xingdi Yuan
Marc-Alexandre Côté
Pierre-Yves Oudeyer
LLMAG
56
0
0
08 Jul 2022
Big Learning
Yulai Cong
Miaoyun Zhao
AI4CE
91
0
0
08 Jul 2022
Boosting Zero-shot Learning via Contrastive Optimization of Attribute Representations
Yu Du
Miaojing Shi
Fangyun Wei
Guoqi Li
VLM
121
15
0
08 Jul 2022
Meta-Learning the Difference: Preparing Large Language Models for Efficient Adaptation
Zejiang Hou
Julian Salazar
George Polovets
71
15
0
07 Jul 2022
Training Transformers Together
Alexander Borzunov
Max Ryabinin
Tim Dettmers
Quentin Lhoest
Lucile Saulnier
Michael Diskin
Yacine Jernite
Thomas Wolf
ViT
63
10
0
07 Jul 2022
Can Language Models perform Abductive Commonsense Reasoning?
Seung-Kyum Kim
ReLM, LRM
45
1
0
07 Jul 2022
The "Collections as ML Data" Checklist for Machine Learning & Cultural Heritage
Benjamin Charles Germain Lee
VLM
43
7
0
06 Jul 2022
When does Bias Transfer in Transfer Learning?
Hadi Salman
Saachi Jain
Andrew Ilyas
Logan Engstrom
Eric Wong
Aleksander Madry
89
37
0
06 Jul 2022
Rethinking the Value of Gazetteer in Chinese Named Entity Recognition
Qianglong Chen
Xiangji Zeng
Jiangang Zhu
Yin Zhang
Bojia Lin
Yang Yang
Daxin Jiang
55
2
0
06 Jul 2022
Don't Pay Attention to the Noise: Learning Self-supervised Representations of Light Curves with a Denoising Time Series Transformer
M. Morvan
N. Nikolaou
K. H. Yip
Ingo P. Waldmann
AI4TS
126
11
0
06 Jul 2022
Pure Transformers are Powerful Graph Learners
Jinwoo Kim
Tien Dat Nguyen
Seonwoo Min
Sungjun Cho
Moontae Lee
Honglak Lee
Seunghoon Hong
99
201
0
06 Jul 2022
Transformers are Adaptable Task Planners
Vidhi Jain
Yixin Lin
Eric Undersander
Yonatan Bisk
Akshara Rai
113
24
0
06 Jul 2022
Robustness Analysis of Video-Language Models Against Visual and Language Perturbations
Madeline Chantry Schiappa
Shruti Vyas
Hamid Palangi
Yogesh S Rawat
Vibhav Vineet
VLM
162
20
0
05 Jul 2022
Betty: An Automatic Differentiation Library for Multilevel Optimization
Sang Keun Choe
Willie Neiswanger
P. Xie
Eric P. Xing
AI4CE
81
32
0
05 Jul 2022
Cross-Lingual QA as a Stepping Stone for Monolingual Open QA in Icelandic
Vésteinn Snæbjarnarson
H. Einarsson
57
6
0
05 Jul 2022
TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second
Noah Hollmann
Samuel G. Müller
Katharina Eggensperger
Frank Hutter
121
317
0
05 Jul 2022
Softmax-free Linear Transformers
Jiachen Lu
Junge Zhang
Xiatian Zhu
Jianfeng Feng
Tao Xiang
Li Zhang
ViT
54
8
0
05 Jul 2022
CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
Hung Le
Yue Wang
Akhilesh Deepak Gotmare
Silvio Savarese
Guosheng Lin
SyDa, ALM
227
273
0
05 Jul 2022
Probing via Prompting
Jiaoda Li
Ryan Cotterell
Mrinmaya Sachan
109
13
0
04 Jul 2022
Location reference recognition from texts: A survey and comparison
Xuke Hu
Zhiyong Zhou
Hao Li
Yingjie Hu
F. Gu
J. Kersten
H. Fan
Friederike Klan
55
51
0
04 Jul 2022
Comparing Feature Importance and Rule Extraction for Interpretability on Text Data
Gianluigi Lopardo
Damien Garreau
FAtt
98
1
0
04 Jul 2022
WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
Shunyu Yao
Howard Chen
John Yang
Karthik Narasimhan
LLMAG, LM&Ro
176
522
0
04 Jul 2022
PrUE: Distilling Knowledge from Sparse Teacher Networks
Shaopu Wang
Xiaojun Chen
Mengzhen Kou
Jinqiao Shi
106
2
0
03 Jul 2022
An Empirical Survey on Long Document Summarization: Datasets, Models and Metrics
Huan Yee Koh
Jiaxin Ju
Ming Liu
Shirui Pan
147
128
0
03 Jul 2022
Rationale-Augmented Ensembles in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Denny Zhou
ReLM, LRM
119
126
0
02 Jul 2022
Can Language Models Make Fun? A Case Study in Chinese Comical Crosstalk
Benyou Wang
Xiang Wu
Xiaokang Liu
Jianquan Li
Prayag Tiwari
Qianqian Xie
61
6
0
02 Jul 2022