ResearchTrend.AI
Language Models are Few-Shot Learners

28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
    BDL

Papers citing "Language Models are Few-Shot Learners"

50 / 12,483 papers shown
Toward General Design Principles for Generative AI Applications
Justin D. Weisz
Michael J. Muller
Jessica He
Stephanie Houde
AI4CE
98
59
0
13 Jan 2023
Hyperparameter Optimization as a Service on INFN Cloud
M. Barbetti
Lucio Anderlini
19
1
0
13 Jan 2023
It's Just a Matter of Time: Detecting Depression with Time-Enriched Multimodal Transformers
Ana-Maria Bucur
Adrian Cosma
Paolo Rosso
Liviu P. Dinu
84
34
0
13 Jan 2023
Prompting Neural Machine Translation with Translation Memories
Abudurexiti Reheman
Tao Zhou
Yingfeng Luo
Di Yang
Tong Xiao
Jingbo Zhu
AI4CE VLM
82
5
0
13 Jan 2023
A Comprehensive Review of Data-Driven Co-Speech Gesture Generation
Simbarashe Nyatsanga
Taras Kucherenko
Chaitanya Ahuja
G. Henter
Michael Neff
SLR
114
94
0
13 Jan 2023
Blind Judgement: Agent-Based Supreme Court Modelling With GPT
S. Hamilton
LLMAG ELM
78
41
0
12 Jan 2023
Rock Guitar Tablature Generation via Natural Language Processing
Josue Casco-Rodriguez
73
1
0
12 Jan 2023
See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning
Zhenfang Chen
Qinhong Zhou
Songlin Yang
Yining Hong
Hao Zhang
Chuang Gan
LRM VLM
116
41
0
12 Jan 2023
Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Pruning
Huan Wang
Can Qin
Yue Bai
Yun Fu
103
21
0
12 Jan 2023
Progress measures for grokking via mechanistic interpretability
Neel Nanda
Lawrence Chan
Tom Lieberum
Jess Smith
Jacob Steinhardt
115
451
0
12 Jan 2023
Thompson Sampling with Diffusion Generative Prior
Yu-Guan Hsieh
S. Kasiviswanathan
Branislav Kveton
Patrick Blobaum
DiffM
60
7
0
12 Jan 2023
Improving Inference Performance of Machine Learning with the Divide-and-Conquer Principle
Alex Kogan
LRM
126
0
0
12 Jan 2023
Toward Building General Foundation Models for Language, Vision, and Vision-Language Understanding Tasks
Xinsong Zhang
Yan Zeng
Jipeng Zhang
Hang Li
VLM AI4CE LRM
122
17
0
12 Jan 2023
ViTs for SITS: Vision Transformers for Satellite Image Time Series
Michail Tarasiou
Erik Chavez
Stefanos Zafeiriou
ViT
86
56
0
12 Jan 2023
Self-Attention Amortized Distributional Projection Optimization for Sliced Wasserstein Point-Cloud Reconstruction
Khai Nguyen
Dang Nguyen
N. Ho
78
9
0
12 Jan 2023
Language Cognition and Language Computation -- Human and Machine Language Understanding
Shaonan Wang
Nai Ding
Nan Lin
Jiajun Zhang
Chengqing Zong
78
2
0
12 Jan 2023
Artificial Intelligence Generated Coins for Size Comparison
Gerald Artner
62
0
0
11 Jan 2023
Exploring the Approximation Capabilities of Multiplicative Neural Networks for Smooth Functions
Ido Ben-Shaul
Tomer Galanti
S. Dekel
80
3
0
11 Jan 2023
Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing
Shruthi Bannur
Stephanie L. Hyland
Qianchu Liu
Fernando Pérez-García
Maximilian Ilse
...
Maria T. A. Wetscherek
M. Lungren
A. Nori
Javier Alvarez-Valle
Ozan Oktay
87
127
0
11 Jan 2023
The Role of Interactive Visualization in Explaining (Large) NLP Models: from Data to Inference
R. Brath
Daniel A. Keim
Johannes Knittel
Shimei Pan
Pia Sommerauer
Hendrik Strobelt
51
11
0
11 Jan 2023
Diving Deep into Modes of Fact Hallucinations in Dialogue Systems
Souvik Das
Sougata Saha
Rohini Srihari
HILM
68
34
0
11 Jan 2023
GPT as Knowledge Worker: A Zero-Shot Evaluation of (AI)CPA Capabilities
Jillian Bommarito
M. Bommarito
Daniel Martin Katz
Jessica Katz
ELM
73
55
0
11 Jan 2023
TAPS: Topology-Aware Intra-Operator Parallelism Strategy Searching Algorithm for Deep Neural Networks
Peng Liang
Hao Zheng
Teng Su
Linbo Qiao
Dongsheng Li
48
0
0
11 Jan 2023
Data Distillation: A Survey
Noveen Sachdeva
Julian McAuley
DD
110
78
0
11 Jan 2023
Structured Case-based Reasoning for Inference-time Adaptation of Text-to-SQL parsers
Abhijeet Awasthi
Soumen Chakrabarti
Sunita Sarawagi
94
5
0
10 Jan 2023
There is No Big Brother or Small Brother: Knowledge Infusion in Language Models for Link Prediction and Question Answering
Ankush Agarwal
Sakharam Gawade
Sachin Channabasavarajendra
P. Bhattacharyya
59
6
0
10 Jan 2023
Recommending Root-Cause and Mitigation Steps for Cloud Incidents using Large Language Models
Toufique Ahmed
Supriyo Ghosh
Chetan Bansal
Thomas Zimmermann
Xuchao Zhang
Saravan Rajmohan
AI4CE
77
59
0
10 Jan 2023
Cross-Model Comparative Loss for Enhancing Neuronal Utility in Language Understanding
Yunchang Zhu
Liang Pang
Kangxi Wu
Yanyan Lan
Huawei Shen
Xueqi Cheng
AAML ELM
59
2
0
10 Jan 2023
Memory Augmented Large Language Models are Computationally Universal
Dale Schuurmans
84
46
0
10 Jan 2023
Scaling Laws for Generative Mixed-Modal Language Models
Armen Aghajanyan
L. Yu
Alexis Conneau
Wei-Ning Hsu
Karen Hambardzumyan
Susan Zhang
Stephen Roller
Naman Goyal
Omer Levy
Luke Zettlemoyer
MoE VLM
100
110
0
10 Jan 2023
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling
Keyu Tian
Yi Jiang
Qishuai Diao
Chen Lin
Liwei Wang
Zehuan Yuan
89
106
0
09 Jan 2023
Advances in Medical Image Analysis with Vision Transformers: A Comprehensive Review
Reza Azad
Amirhossein Kazerouni
Moein Heidari
Ehsan Khodapanah Aghdam
Amir Molaei
Yiwei Jia
Abin Jose
Rijo Roy
Dorit Merhof
MedIm ViT
118
188
0
09 Jan 2023
Learning Bidirectional Action-Language Translation with Limited Supervision and Incongruent Input
Ozan Ozdemir
Matthias Kerzel
C. Weber
Jae Hee Lee
Muhammad Burhan Hafez
P. Bruns
S. Wermter
61
1
0
09 Jan 2023
Universal Information Extraction as Unified Semantic Matching
Jie Lou
Yaojie Lu
Dai Dai
Wei Jia
Hongyu Lin
Xianpei Han
Le Sun
Hua Wu
80
72
0
09 Jan 2023
Removing Non-Stationary Knowledge From Pre-Trained Language Models for Entity-Level Sentiment Classification in Finance
Seunghyeok Hong
Hanwool Albert Lee
Nahyeon Kang
Moonjeong Hahm
67
8
0
09 Jan 2023
AI Maintenance: A Robustness Perspective
Pin-Yu Chen
Payel Das
84
14
0
08 Jan 2023
A Survey on Transformers in Reinforcement Learning
Wenzhe Li
Hao Luo
Zichuan Lin
Chongjie Zhang
Zongqing Lu
Deheng Ye
OffRL MU AI4CE
128
58
0
08 Jan 2023
Learning the Relation between Similarity Loss and Clustering Loss in Self-Supervised Learning
Jidong Ge
YuXiang Liu
Jie Gui
Lanting Fang
Ming Lin
James T. Kwok
LiGuo Huang
B. Luo
SSL
86
5
0
08 Jan 2023
InPars-Light: Cost-Effective Unsupervised Training of Efficient Rankers
Leonid Boytsov
Preksha Patel
Vivek Sourabh
Riddhi Nisar
Sayan Kundu
R. Ramanathan
Eric Nyberg
68
20
0
08 Jan 2023
Transferring Pre-trained Multimodal Representations with Cross-modal Similarity Matching
Byoungjip Kim
Sun Choi
Dasol Hwang
Moontae Lee
Honglak Lee
71
11
0
07 Jan 2023
Why do Nearest Neighbor Language Models Work?
Frank F. Xu
Uri Alon
Graham Neubig
RALM
71
23
0
07 Jan 2023
Witscript 3: A Hybrid AI System for Improvising Jokes in a Conversation
Joe Toplyn
60
9
0
06 Jan 2023
Systems for Parallel and Distributed Large-Model Deep Learning Training
Kabir Nagrecha
GNN VLM MoE
74
7
0
06 Jan 2023
Does compressing activations help model parallel training?
S. Bian
Dacheng Li
Hongyi Wang
Eric P. Xing
Shivaram Venkataraman
74
9
0
06 Jan 2023
"No, to the Right" -- Online Language Corrections for Robotic Manipulation via Shared Autonomy
Yuchen Cui
Siddharth Karamcheti
Raj Palleti
Nidhya Shivakumar
Percy Liang
Dorsa Sadigh
LM&Ro
115
83
0
06 Jan 2023
TrojanPuzzle: Covertly Poisoning Code-Suggestion Models
H. Aghakhani
Wei Dai
Andre Manoel
Xavier Fernandes
Anant Kharkar
Christopher Kruegel
Giovanni Vigna
David Evans
B. Zorn
Robert Sim
SILM
69
37
0
06 Jan 2023
Improving Human-AI Collaboration With Descriptions of AI Behavior
Ángel Alexander Cabrera
Adam Perer
Jason I. Hong
84
40
0
06 Jan 2023
Evidence of behavior consistent with self-interest and altruism in an artificially intelligent agent
Tim Johnson
Nick Obradovich
35
6
0
05 Jan 2023
Sequentially Controlled Text Generation
Alexander Spangher
Xinyu Hua
Yao Ming
Nanyun Peng
77
7
0
05 Jan 2023
All in Tokens: Unifying Output Space of Visual Tasks via Soft Token
Jia Ning
Chen Li
Zheng Zhang
Zigang Geng
Qi Dai
Kun He
Han Hu
130
47
0
05 Jan 2023