arXiv:2005.14165
Language Models are Few-Shot Learners
28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
Papers citing
"Language Models are Few-Shot Learners"
50 / 10,741 papers shown
Title
GraphPrompter: Multi-stage Adaptive Prompt Optimization for Graph In-Context Learning
Rui Lv
Zhenru Zhang
Kai Zhang
Qi Liu
Weibo Gao
Jing Liu
Jiaxia Yan
Linan Yue
Fangzhou Yao
154
0
0
04 May 2025
What do Language Model Probabilities Represent? From Distribution Estimation to Response Prediction
Eitan Wagner
Omri Abend
39
0
0
04 May 2025
DNAZEN: Enhanced Gene Sequence Representations via Mixed Granularities of Coding Units
Lei Mao
Yuanhe Tian
Yan Song
23
0
0
04 May 2025
A survey of agent interoperability protocols: Model Context Protocol (MCP), Agent Communication Protocol (ACP), Agent-to-Agent Protocol (A2A), and Agent Network Protocol (ANP)
Abul Ehtesham
Aditi Singh
Gaurav Kumar Gupta
Saket Kumar
38
1
0
04 May 2025
Towards Safer Pretraining: Analyzing and Filtering Harmful Content in Webscale datasets for Responsible LLMs
Sai Krishna Mendu
Harish Yenala
Aditi Gulati
Shanu Kumar
Parag Agrawal
36
0
0
04 May 2025
LLM-OptiRA: LLM-Driven Optimization of Resource Allocation for Non-Convex Problems in Wireless Communications
Xinyue Peng
Yanming Liu
Yihan Cang
Chaoqun Cao
Ming Chen
56
0
0
04 May 2025
Improving Physical Object State Representation in Text-to-Image Generative Systems
Tianle Chen
Chaitanya Chakka
Deepti Ghadiyaram
34
0
0
04 May 2025
SEval-Ex: A Statement-Level Framework for Explainable Summarization Evaluation
Tanguy Herserant
Vincent Guigue
ELM
40
0
0
04 May 2025
Measuring Hong Kong Massive Multi-Task Language Understanding
Chuxue Cao
Zhenghao Zhu
Junqi Zhu
Guoying Lu
Siyu Peng
Juntao Dai
Weijie Shi
Sirui Han
Yike Guo
ELM
160
0
0
04 May 2025
SimAug: Enhancing Recommendation with Pretrained Language Models for Dense and Balanced Data Augmentation
Yuying Zhao
Xiaodong Yang
Huiyuan Chen
Xiran Fan
Yu-Chiang Frank Wang
Y. Cai
Tyler Derr
29
0
0
03 May 2025
Neural Orchestration for Multi-Agent Systems: A Deep Learning Framework for Optimal Agent Selection in Multi-Domain Task Environments
Kushagra Agrawal
Nisharg Nargund
AI4CE
15
0
0
03 May 2025
Efficient Shapley Value-based Non-Uniform Pruning of Large Language Models
Chuan Sun
Han Yu
Lizhen Cui
Xiaoxiao Li
105
0
0
03 May 2025
Intra-Layer Recurrence in Transformers for Language Modeling
Anthony Nguyen
Wenjun Lin
31
0
0
03 May 2025
High-Fidelity Pseudo-label Generation by Large Language Models for Training Robust Radiology Report Classifiers
Brian Wong
Kaito Tanaka
37
0
0
03 May 2025
Towards Artificial Intelligence Research Assistant for Expert-Involved Learning
Tianyu Liu
Simeng Han
Xiao Luo
Haoyu Wang
Pan Lu
...
Arman Cohan
Hua Xu
Mark B. Gerstein
James Zou
Hongyu Zhao
39
0
0
03 May 2025
Vision and Intention Boost Large Language Model in Long-Term Action Anticipation
Congqi Cao
Lanshu Hu
Yating Yu
Y. Zhang
VLM
152
0
0
03 May 2025
Knowledge-Augmented Language Models Interpreting Structured Chest X-Ray Findings
Alexander Davis
Rafael Souza
Jia-Hao Lim
109
0
0
03 May 2025
Exploring the Role of Diversity in Example Selection for In-Context Learning
Janak Kapuriya
Manit Kaushik
Debasis Ganguly
S. Bhatia
27
0
0
03 May 2025
LLM Security: Vulnerabilities, Attacks, Defenses, and Countermeasures
Francisco Aguilera-Martínez
Fernando Berzal
PILM
55
0
0
02 May 2025
Enhancing User Sequence Modeling through Barlow Twins-based Self-Supervised Learning
Yuhan Liu
Lin Ning
Neo Wu
Karan Singhal
Philip Mansfield
D. Berlowitz
Sushant Prakash
Bradley Green
SSL
54
0
0
02 May 2025
Accelerating Deep Neural Network Training via Distributed Hybrid Order Optimization
Shunxian Gu
Chaoqun You
Bangbang Ren
Lailong Luo
Junxu Xia
Deke Guo
44
0
0
02 May 2025
Enhancing ML Model Interpretability: Leveraging Fine-Tuned Large Language Models for Better Understanding of AI
Jonas Bokstaller
Julia Altheimer
Julian Dormehl
Alina Buss
Jasper Wiltfang
Johannes Schneider
Maximilian Röglinger
24
0
0
02 May 2025
Don't be lazy: CompleteP enables compute-efficient deep transformers
Nolan Dey
Bin Claire Zhang
Lorenzo Noci
Mufan Li
Blake Bordelon
Shane Bergsma
Cengiz Pehlevan
Boris Hanin
Joel Hestness
44
0
0
02 May 2025
FlowDubber: Movie Dubbing with LLM-based Semantic-aware Learning and Flow Matching based Voice Enhancing
Gaoxiang Cong
Liang-Sheng Li
Jiadong Pan
Zhedong Zhang
Amin Beheshti
Anton Van Den Hengel
Yuankai Qi
Qingming Huang
147
0
0
02 May 2025
Grounding Task Assistance with Multimodal Cues from a Single Demonstration
Gabriel Sarch
Balasaravanan Thoravi Kumaravel
Sahithya Ravi
Vibhav Vineet
A. D. Wilson
167
0
0
02 May 2025
AI agents may be worth the hype but not the resources (yet): An initial exploration of machine translation quality and costs in three language pairs in the legal and news domains
Vicent Briva-Iglesias
Gokhan Dogru
LLMAG
ELM
34
0
0
02 May 2025
Multi-agents based User Values Mining for Recommendation
L. Chen
Wei Yuan
Tong Chen
Xiangyu Zhao
Nguyen Quoc Viet Hung
Hongzhi Yin
OffRL
49
0
0
02 May 2025
Nesterov Method for Asynchronous Pipeline Parallel Optimization
Thalaiyasingam Ajanthan
Sameera Ramasinghe
Yan Zuo
Gil Avraham
Alexander Long
24
0
0
02 May 2025
Synthesize-on-Graph: Knowledgeable Synthetic Data Generation for Continue Pre-training of Large Language Models
Xuhui Jiang
Shengjie Ma
Chengjin Xu
Cehao Yang
Liyu Zhang
Jian Guo
SyDa
30
0
0
02 May 2025
MateICL: Mitigating Attention Dispersion in Large-Scale In-Context Learning
Murtadha Ahmed
Wenbo
Liu yunfeng
41
0
0
02 May 2025
TSTMotion: Training-free Scene-aware Text-to-motion Generation
Ziyan Guo
Haoxuan Qu
Hossein Rahmani
Dewen Soh
Ping Hu
Qiuhong Ke
Xiaozhong Liu
VGen
68
1
0
02 May 2025
Artificial Intelligence in Government: Why People Feel They Lose Control
Alexander Wuttke
Adrian Rauchfleisch
Andreas Jungherr
35
0
0
02 May 2025
Do We Need a Detailed Rubric for Automated Essay Scoring using Large Language Models?
Lui Yoshida
47
0
0
02 May 2025
Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing
Piotr Piekos
Róbert Csordás
Jürgen Schmidhuber
MoE
VLM
102
1
0
01 May 2025
Investigating Task Arithmetic for Zero-Shot Information Retrieval
Marco Braga
Pranav Kasela
Alessandro Raganato
G. Pasi
RALM
66
0
0
01 May 2025
Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks
Vishnu Sarukkai
Zhiqiang Xie
Kayvon Fatahalian
LLMAG
75
0
0
01 May 2025
Voice Cloning: Comprehensive Survey
Hussam Azzuni
Abdulmotaleb El Saddik
VLM
44
0
0
01 May 2025
Controllable Weather Synthesis and Removal with Video Diffusion Models
Chih-Hao Lin
Zihan Wang
Ruofan Liang
Yuxuan Zhang
Sanja Fidler
Shenlong Wang
Zan Gojcic
DiffM
VGen
48
0
0
01 May 2025
On the generalization of language models from in-context learning and finetuning: a controlled study
Andrew Kyle Lampinen
Arslan Chaudhry
Stephanie Chan
Cody Wild
Diane Wan
Alex Ku
Jorg Bornschein
Razvan Pascanu
Murray Shanahan
James L. McClelland
46
0
0
01 May 2025
A Mathematical Philosophy of Explanations in Mechanistic Interpretability -- The Strange Science Part I.i
Kola Ayonrinde
Louis Jaburi
MILM
88
1
0
01 May 2025
RayZer: A Self-supervised Large View Synthesis Model
Hanwen Jiang
Hao Tan
Peng Wang
Haian Jin
Yue Zhao
...
Kai Zhang
Fujun Luan
Kalyan Sunkavalli
Qixing Huang
Georgios Pavlakos
68
0
0
01 May 2025
Data Therapist: Eliciting Domain Knowledge from Subject Matter Experts Using Large Language Models
Sungbok Shin
Hyeon Jeon
Sanghyun Hong
Niklas Elmqvist
161
0
0
01 May 2025
Vision Mamba in Remote Sensing: A Comprehensive Survey of Techniques, Applications and Outlook
Muyi Bao
Shuchang Lyu
Zhaoyang Xu
Huiyu Zhou
Jinchang Ren
Shiming Xiang
Xiaomeng Li
Guangliang Cheng
Mamba
87
0
0
01 May 2025
Block Circulant Adapter for Large Language Models
Xinyu Ding
Meiqi Wang
Siyu Liao
Zhongfeng Wang
38
0
0
01 May 2025
Improving Routing in Sparse Mixture of Experts with Graph of Tokens
Tam Minh Nguyen
Ngoc N. Tran
Khai Nguyen
Richard G. Baraniuk
MoE
66
0
0
01 May 2025
CSE-SFP: Enabling Unsupervised Sentence Representation Learning via a Single Forward Pass
Bowen Zhang
Zixin Song
Chunping Li
29
0
0
01 May 2025
Parameter-Efficient Fine-Tuning with Circulant and Diagonal Vectors
Xinyu Ding
Lexuan Chen
Siyu Liao
Zhongfeng Wang
52
0
0
01 May 2025
Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation
Vaidehi Patil
Yi-Lin Sung
Peter Hase
Jie Peng
Jen-tse Huang
Joey Tianyi Zhou
AAML
MU
83
3
0
01 May 2025
Pre-Training Estimators for Structural Models: Application to Consumer Search
Yanhao 'Max' Wei
Zhenling Jiang
36
0
0
01 May 2025
Fast and Low-Cost Genomic Foundation Models via Outlier Removal
Haozheng Luo
Chenghao Qiu
Maojiang Su
Zhihan Zhou
Zoe Mehta
Guo Ye
Jerry Yao-Chieh Hu
Han Liu
AAML
55
1
0
01 May 2025