Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2005.14165
Cited By
v1
v2
v3
v4 (latest)
Language Models are Few-Shot Learners
28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Ma-teusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Language Models are Few-Shot Learners"
50 / 1,609 papers shown
Title
OR-Bench: An Over-Refusal Benchmark for Large Language Models
Justin Cui
Wei-Lin Chiang
Ion Stoica
Cho-Jui Hsieh
ALM
146
54
0
31 May 2024
Scaling White-Box Transformers for Vision
Jinrui Yang
Xianhang Li
Druv Pai
Yuyin Zhou
Yi-An Ma
Yaodong Yu
Cihang Xie
ViT
79
9
0
30 May 2024
Towards Hierarchical Multi-Agent Workflows for Zero-Shot Prompt Optimization
Yuchi Liu
Jaskirat Singh
Gaowen Liu
Ali Payani
Liang Zheng
LLMAG
101
6
0
30 May 2024
Is In-Context Learning Sufficient for Instruction Following in LLMs?
Hao Zhao
Maksym Andriushchenko
Francesco Croce
Nicolas Flammarion
101
13
0
30 May 2024
Training-efficient density quantum machine learning
Brian Coyle
El Amine Cherrat
Nishant Jain
Natansh Mathur
Snehal Raj
Skander Kazdaghli
Iordanis Kerenidis
101
5
0
30 May 2024
MetaToken: Detecting Hallucination in Image Descriptions by Meta Classification
Laura Fieback
Jakob Spiegelberg
Hanno Gottschalk
MLLM
177
5
0
29 May 2024
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution
Minghan Li
Xilun Chen
Ari Holtzman
Beidi Chen
Jimmy Lin
Wen-tau Yih
Xi Lin
RALM
BDL
181
14
0
29 May 2024
A Causal Framework for Evaluating Deferring Systems
Filippo Palomba
Andrea Pugnana
Jose M. Alvarez
Salvatore Ruggieri
CML
129
4
0
29 May 2024
Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice
Jian-Qiao Zhu
Haijiang Yan
Thomas Griffiths
122
3
0
29 May 2024
An Empirical Analysis of Forgetting in Pre-trained Models with Incremental Low-Rank Updates
Albin Soutif--Cormerais
Simone Magistri
Joost van de Weijer
Andew D. Bagdanov
94
2
0
28 May 2024
Learning diverse attacks on large language models for robust red-teaming and safety tuning
Seanie Lee
Minsu Kim
Lynn Cherif
David Dobre
Juho Lee
...
Kenji Kawaguchi
Gauthier Gidel
Yoshua Bengio
Nikolay Malkin
Moksh Jain
AAML
128
20
0
28 May 2024
Adapting Pre-Trained Vision Models for Novel Instance Detection and Segmentation
Ya Lu
Jishnu Jaykumar
Yunhui Guo
Nicholas Ruozzi
Yu Xiang
VLM
ISeg
109
5
0
28 May 2024
Dual Process Learning: Controlling Use of In-Context vs. In-Weights Strategies with Weight Forgetting
Suraj Anand
Michael A. Lepori
Jack Merullo
Ellie Pavlick
CLL
95
8
0
28 May 2024
Metaheuristics and Large Language Models Join Forces: Toward an Integrated Optimization Approach
Camilo Chacón Sartori
Christian Blum
Filippo Bistaffa
Guillem Rodríguez Corominas
AIFin
109
4
0
28 May 2024
Glauber Generative Model: Discrete Diffusion Models via Binary Classification
Harshit Varma
Dheeraj M. Nagaraj
Karthikeyan Shanmugam
VLM
149
3
0
27 May 2024
Tokenization Matters! Degrading Large Language Models through Challenging Their Tokenization
Dixuan Wang
Yanda Li
Junyuan Jiang
Zepeng Ding
Ziqin Luo
Guochao Jiang
Jiaqing Liang
Deqing Yang
87
15
0
27 May 2024
Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling
Cristian Rodriguez-Opazo
Ehsan Abbasnejad
Damien Teney
Edison Marrese-Taylor
Hamed Damirchi
Anton Van Den Hengel
VLM
96
1
0
27 May 2024
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
Chankyu Lee
Rajarshi Roy
Mengyao Xu
Jonathan Raiman
Mohammad Shoeybi
Bryan Catanzaro
Ming-Yu Liu
RALM
257
202
0
27 May 2024
LARM: Large Auto-Regressive Model for Long-Horizon Embodied Intelligence
Zhuoling Li
Xiaogang Xu
Zhenhua Xu
Sernam Lim
Hengshuang Zhao
LM&Ro
114
2
0
27 May 2024
CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion
Jiayi Yao
Hanchen Li
Yuhan Liu
Siddhant Ray
Yihua Cheng
Qizheng Zhang
Kuntai Du
Shan Lu
Junchen Jiang
91
24
0
26 May 2024
Synthesizing Programmatic Reinforcement Learning Policies with Large Language Model Guided Search
Max Liu
Chan-Hung Yu
Wei-Hsu Lee
Cheng-Wei Hung
Yen-Chun Chen
Shao-Hua Sun
97
5
0
26 May 2024
Unsupervised Meta-Learning via In-Context Learning
Anna Vettoruzzo
Lorenzo Braccaioli
Joaquin Vanschoren
M. Nowaczyk
SSL
103
1
0
25 May 2024
Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement
Xiyao Wang
Jiuhai Chen
Zhaoyang Wang
Yuhang Zhou
Yiyang Zhou
...
Dinesh Manocha
Tom Goldstein
Parminder Bhatia
Furong Huang
Cao Xiao
147
38
0
24 May 2024
Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization
Zhe Li
Bicheng Ying
Zidong Liu
Chaosheng Dong
Haibo Yang
FedML
120
3
0
24 May 2024
Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models
Cong Lu
Shengran Hu
Jeff Clune
LLMAG
87
12
0
24 May 2024
What Do You See? Enhancing Zero-Shot Image Classification with Multimodal Large Language Models
Abdelrahman Abdelhamed
Mahmoud Afifi
Alec Go
MLLM
VLM
93
3
0
24 May 2024
A Structure-Aware Framework for Learning Device Placements on Computation Graphs
Shukai Duan
Heng Ping
Nikos Kanakaris
Xiongye Xiao
Panagiotis Kyriakis
...
Guixiang Ma
Mihai Capota
Shahin Nazarian
Theodore L. Willke
Paul Bogdan
94
5
0
23 May 2024
LoRA-Ensemble: Efficient Uncertainty Modelling for Self-Attention Networks
Michelle Halbheer
Dominik J. Mühlematter
Alexander Becker
Dominik Narnhofer
Helge Aasen
Konrad Schindler
Mehmet Özgür Türkoglu
UQCV
101
3
0
23 May 2024
OAC: Output-adaptive Calibration for Accurate Post-training Quantization
Ali Edalati
Alireza Ghaffari
M. Asgharian
Lu Hou
Boxing Chen
Vahid Partovi Nia
V. Nia
MQ
126
0
0
23 May 2024
Can LLMs Solve longer Math Word Problems Better?
Xin Xu
Tong Xiao
Zitong Chao
Zhenya Huang
Can Yang
Yang Wang
113
14
0
23 May 2024
Implicit In-context Learning
Zhuowei Li
Zihao Xu
Ligong Han
Yunhe Gao
Song Wen
Di Liu
Hao Wang
Dimitris N. Metaxas
106
3
0
23 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
292
54
0
23 May 2024
Text-to-Model: Text-Conditioned Neural Network Diffusion for Train-Once-for-All Personalization
Zexi Li
Lingzhi Gao
Chao Wu
AI4CE
DiffM
112
3
0
23 May 2024
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
Wei Huang
Haotong Qin
Yangdong Liu
Yawei Li
Qinshuo Liu
Xianglong Liu
Luca Benini
Michele Magno
Shiming Zhang
Xiaojuan Qi
MQ
121
19
0
23 May 2024
Slaves to the Law of Large Numbers: An Asymptotic Equipartition Property for Perplexity in Generative Language Models
Avinash Mudireddy
Tyler Bell
R. Mudumbai
58
2
0
22 May 2024
DEGAP: Dual Event-Guided Adaptive Prefixes for Templated-Based Event Argument Extraction with Slot Querying
Guanghui Wang
Dexi Liu
Jian-Yun Nie
Qizhi Wan
Rong Hu
Xiping Liu
Wanlong Liu
Jiaming Liu
293
0
0
22 May 2024
FiDeLiS: Faithful Reasoning in Large Language Model for Knowledge Graph Question Answering
Yuan Sui
Yufei He
Nian Liu
Xiaoxin He
Kun Wang
Bryan Hooi
LRM
135
11
0
22 May 2024
Single Image Unlearning: Efficient Machine Unlearning in Multimodal Large Language Models
Jiaqi Li
Qianshan Wei
Chuanyi Zhang
Guilin Qi
Miaozeng Du
Yongrui Chen
Sheng Bi
Fan Liu
VLM
MU
157
17
0
21 May 2024
Asymptotic theory of in-context learning by linear attention
Yue M. Lu
Mary I. Letey
Jacob A. Zavatone-Veth
Anindita Maiti
Cengiz Pehlevan
88
16
0
20 May 2024
(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts
Minghao Wu
Jiahao Xu
Yulin Yuan
Gholamreza Haffari
Longyue Wang
Weihua Luo
Kaifu Zhang
LLMAG
168
27
0
20 May 2024
The Power of Active Multi-Task Learning in Reinforcement Learning from Human Feedback
Ruitao Chen
Liwei Wang
113
1
0
18 May 2024
Lean Attention: Hardware-Aware Scalable Attention Mechanism for the Decode-Phase of Transformers
Rya Sanovar
Srikant Bharadwaj
Renée St. Amant
Victor Rühle
Saravan Rajmohan
136
7
0
17 May 2024
ActiveLLM: Large Language Model-based Active Learning for Textual Few-Shot Scenarios
Markus Bayer
Justin Lutz
Christian A. Reuter
121
7
0
17 May 2024
Mitigating Text Toxicity with Counterfactual Generation
Milan Bhan
Jean-Noel Vittaut
Nina Achache
Victor Legrand
Nicolas Chesneau
A. Blangero
Juliette Murris
Marie-Jeanne Lesot
MedIm
196
0
0
16 May 2024
Contextual Emotion Recognition using Large Vision Language Models
Yasaman Etesam
Özge Nilay Yalçin
Chuxuan Zhang
Angelica Lim
VLM
99
4
0
14 May 2024
Full Line Code Completion: Bringing AI to Desktop
Anton Semenkin
Vitaliy Bibaev
Yaroslav Sokolov
Kirill Krylov
Alexey Kalina
...
Mikhail Podvitskii
Petr Surkov
Yaroslav Golubev
Nikita Povarov
T. Bryksin
74
2
0
14 May 2024
SpeechVerse: A Large-scale Generalizable Audio Language Model
Nilaksh Das
Saket Dingliwal
S. Ronanki
Rohit Paturi
David Huang
...
Monica Sunkara
S. Srinivasan
Kyu J. Han
Katrin Kirchhoff
Katrin Kirchhoff
75
43
0
14 May 2024
DEPTH: Discourse Education through Pre-Training Hierarchically
Zachary Bamberger
Ofek Glick
Chaim Baskin
Yonatan Belinkov
107
0
0
13 May 2024
TANQ: An open domain dataset of table answered questions
Mubashara Akhtar
Chenxi Pang
Andreea Marzoca
Yasemin Altun
Julian Martin Eisenschlos
LMTD
RALM
92
2
0
13 May 2024
AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments
Samuel Schmidgall
Rojin Ziaei
Carl Harris
Eduardo Reis
Jeffrey Jopling
Michael Moor
142
54
0
13 May 2024
Previous
1
2
3
...
24
25
26
...
31
32
33
Next