2005.14165
Cited By
v4 (latest)
Language Models are Few-Shot Learners
28 May 2020
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
Prafulla Dhariwal
Arvind Neelakantan
Pranav Shyam
Girish Sastry
Amanda Askell
Sandhini Agarwal
Ariel Herbert-Voss
Gretchen Krueger
T. Henighan
R. Child
Aditya A. Ramesh
Daniel M. Ziegler
Jeff Wu
Clemens Winter
Christopher Hesse
Mark Chen
Eric Sigler
Mateusz Litwin
Scott Gray
B. Chess
Jack Clark
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
Papers citing
"Language Models are Few-Shot Learners"
50 / 1,609 papers shown
Title
Southern Newswire Corpus: A Large-Scale Dataset of Mid-Century Wire Articles Beyond the Front Page
Michael McRae
AI4CE
160
0
0
17 Feb 2025
Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving
Xin Xu
Yan Xu
Tianhao Chen
Yuchen Yan
Chengwu Liu
...
Yansen Wang
Yichun Yin
Yijiao Wang
Lifeng Shang
Qiang Liu
LRM
111
3
0
17 Feb 2025
SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs
Yige Xu
Xu Guo
Zhiwei Zeng
Chunyan Miao
LLMAG
CLL
LRM
121
18
0
17 Feb 2025
Towards Top-Down Reasoning: An Explainable Multi-Agent Approach for Visual Question Answering
Zeqing Wang
Wentao Wan
Qiqing Lao
Runmeng Chen
Minjie Lang
Keze Wang
Liang Lin
LRM
202
3
0
17 Feb 2025
Token Pruning in Multimodal Large Language Models: Are We Solving the Right Problem?
Zichen Wen
Yifeng Gao
Weijia Li
Conghui He
Linfeng Zhang
LRM
121
3
0
17 Feb 2025
Zero-shot Model-based Reinforcement Learning using Large Language Models
Abdelhakim Benechehab
Youssef Attia El Hili
Ambroise Odonnat
Oussama Zekri
Albert Thomas
Giuseppe Paolo
Maurizio Filippone
I. Redko
Balázs Kégl
OffRL
115
1
0
17 Feb 2025
Associative Recurrent Memory Transformer
Ivan Rodkin
Yuri Kuratov
Aydar Bulatov
Andrey Kravchenko
126
3
0
17 Feb 2025
Mechanistic Unveiling of Transformer Circuits: Self-Influence as a Key to Model Reasoning
Lefei Zhang
Lijie Hu
Di Wang
LRM
163
4
0
17 Feb 2025
Detecting Phishing Sites Using ChatGPT
Takashi Koide
Naoki Fukushi
Hiroki Nakano
Daiki Chiba
125
31
0
17 Feb 2025
Empirical evaluation of LLMs in predicting fixes of Configuration bugs in Smart Home System
Sheikh Moonwara Anjum Monisha
Atul Bharadwaj
90
0
0
16 Feb 2025
TituLLMs: A Family of Bangla LLMs with Comprehensive Benchmarking
Shahriar Kabir Nahin
R. N. Nandi
Sagor Sarker
Quazi Sarwar Muhtaseem
Md. Kowsher
Apu Chandraw Shill
Md Ibrahim
Mehadi Hasan Menon
Tareq Al Muntasir
Firoj Alam
159
0
0
16 Feb 2025
Smoothing Out Hallucinations: Mitigating LLM Hallucination with Smoothed Knowledge Distillation
Hieu Nguyen
Zihao He
Shoumik Atul Gandre
Ujjwal Pasupulety
Sharanya Kumari Shivakumar
Kristina Lerman
HILM
111
2
0
16 Feb 2025
RAS: Retrieval-And-Structuring for Knowledge-Intensive LLM Generation
Pengcheng Jiang
Lang Cao
Ruike Zhu
Minhao Jiang
Yunyi Zhang
Jimeng Sun
Jiawei Han
RALM
186
4
0
16 Feb 2025
Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning
Gangwei Jiang
Caigao Jiang
Zhaoyi Li
Siqiao Xue
Jun-ping Zhou
Linqi Song
Defu Lian
Yin Wei
CLL
MU
136
1
0
16 Feb 2025
Divergent Thoughts toward One Goal: LLM-based Multi-Agent Collaboration System for Electronic Design Automation
Haoyuan Wu
Haisheng Zheng
Zhuolun He
Bei Yu
90
1
0
15 Feb 2025
A Self-Supervised Reinforcement Learning Approach for Fine-Tuning Large Language Models Using Cross-Attention Signals
Andrew Kiruluta
Andreas Lemos
Priscilla Burity
143
3
0
14 Feb 2025
Large Language Diffusion Models
Shen Nie
Fengqi Zhu
Zebin You
Xiaolu Zhang
Jingyang Ou
Jun Hu
Jun Zhou
Yankai Lin
Ji-Rong Wen
Chongxuan Li
218
48
0
14 Feb 2025
Man Made Language Models? Evaluating LLMs' Perpetuation of Masculine Generics Bias
Enzo Doyen
Amalia Todirascu
82
0
0
14 Feb 2025
Analyzing Patient Daily Movement Behavior Dynamics Using Two-Stage Encoding Model
Jin Cui
Alexander Capstick
Payam Barnaghi
Gregory Scott
88
0
0
14 Feb 2025
LLM-Enhanced Multiple Instance Learning for Joint Rumor and Stance Detection with Social Context Information
Ruichao Yang
Jing Ma
Wei Gao
Hongzhan Lin
124
0
0
13 Feb 2025
SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models
Daniel Fleischer
Moshe Berchansky
Gad Markovits
Moshe Wasserblat
ReLM
ELM
LRM
143
0
0
13 Feb 2025
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding
Mo Yu
Lemao Liu
J. Wu
Tsz Ting Chung
Shunchi Zhang
JiangNan Li
Dit-Yan Yeung
Jie Zhou
178
1
0
13 Feb 2025
Object-Centric Latent Action Learning
Albina Klepach
Alexander Nikulin
Ilya Zisman
Denis Tarasov
Alexander Derevyagin
Andrei Polubarov
Nikita Lyubaykin
Vladislav Kurenkov
113
0
0
13 Feb 2025
Typhoon T1: An Open Thai Reasoning Model
Pittawat Taveekitworachai
Potsawee Manakul
Kasima Tharnpipitchai
Kunat Pipatanakul
OffRL
LRM
186
0
0
13 Feb 2025
Communication is All You Need: Persuasion Dataset Construction via Multi-LLM Communication
Weicheng Ma
Hefan Zhang
Ivory Yang
Shiyu Ji
Joice Chen
...
Shubham Mohole
Ethan Gearey
Michael Macy
Saeed Hassanpour
Soroush Vosoughi
95
2
0
13 Feb 2025
Two-Stage Representation Learning for Analyzing Movement Behavior Dynamics in People Living with Dementia
Jin Cui
Alexander Capstick
Payam Barnaghi
Gregory Scott
103
0
0
13 Feb 2025
Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers
Siddharth Singh
Prajwal Singhania
Aditya K. Ranjan
John Kirchenbauer
Jonas Geiping
...
Abhimanyu Hans
Manli Shu
Aditya Tomar
Tom Goldstein
A. Bhatele
166
3
0
12 Feb 2025
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation
Mohammad Mahdi Abootorabi
Amirhosein Zobeiri
Mahdi Dehghani
Mohammadali Mohammadkhani
Bardia Mohammadi
Omid Ghahroodi
M. Baghshah
Ehsaneddin Asgari
RALM
273
7
0
12 Feb 2025
Enhancing LLM Character-Level Manipulation via Divide and Conquer
Zhen Xiong
Yujun Cai
Bryan Hooi
Nanyun Peng
Kai-Wei Chang
Zhecheng Li
128
0
0
12 Feb 2025
ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification
Jiangbo Shi
Chen Li
Tieliang Gong
Yefeng Zheng
Huazhu Fu
VLM
154
11
0
12 Feb 2025
From PowerPoint UI Sketches to Web-Based Applications: Pattern-Driven Code Generation for GIS Dashboard Development Using Knowledge-Augmented LLMs, Context-Aware Visual Prompting, and the React Framework
Haowen Xu
Xiao-Ying Yu
128
1
0
12 Feb 2025
TLOB: A Novel Transformer Model with Dual Attention for Price Trend Prediction with Limit Order Book Data
Leonardo Berti
Gjergji Kasneci
AI4TS
160
1
0
12 Feb 2025
When More is Less: Understanding Chain-of-Thought Length in LLMs
Yuyang Wu
Yifei Wang
Tianqi Du
Stefanie Jegelka
Yisen Wang
LRM
140
40
0
11 Feb 2025
TRAVEL: Training-Free Retrieval and Alignment for Vision-and-Language Navigation
Navid Rajabi
Jana Kosecka
LM&Ro
3DV
118
0
0
11 Feb 2025
URECA: The Chain of Two Minimum Set Cover Problems exists behind Adaptation to Shifts in Semantic Code Search
Seok-Ung Choi
Joonghyuk Hahn
Yo-Sub Han
91
0
0
11 Feb 2025
A Large-Scale Benchmark for Vietnamese Sentence Paraphrases
Sang Quang Nguyen
Kiet Van Nguyen
125
0
0
11 Feb 2025
LongReD: Mitigating Short-Text Degradation of Long-Context Large Language Models via Restoration Distillation
Zican Dong
Junyi Li
Jinhao Jiang
Mingyu Xu
Wayne Xin Zhao
Bin Wang
Xin Wu
VLM
321
5
0
11 Feb 2025
InSTA: Towards Internet-Scale Training For Agents
Brandon Trabucco
Gunnar Sigurdsson
Robinson Piramuthu
Ruslan Salakhutdinov
ALM
166
4
0
10 Feb 2025
Do we really have to filter out random noise in pre-training data for language models?
Jinghan Ru
Yuxin Xie
Xianwei Zhuang
Yuguo Yin
Zhihui Guo
Zhiming Liu
Qianli Ren
Yuexian Zou
176
5
0
10 Feb 2025
LegalViz: Legal Text Visualization by Text To Diagram Generation
Eri Onami
Taiki Miyanishi
Koki Maeda
Shuhei Kurita
AILaw
96
1
0
10 Feb 2025
Uncertainty-Aware Adaptation of Large Language Models for Protein-Protein Interaction Analysis
Sanket Jantre
Tianle Wang
Gilchan Park
Kriti Chopra
Nicholas Jeon
Xiaoning Qian
Nathan M. Urban
Byung-Jun Yoon
165
0
0
10 Feb 2025
EfficientLLM: Scalable Pruning-Aware Pretraining for Architecture-Agnostic Edge Language Models
Xingrun Xing
Zheng Liu
Shitao Xiao
Boyan Gao
Yiming Liang
Wanpeng Zhang
Haokun Lin
Guoqi Li
Jiajun Zhang
LRM
178
1
0
10 Feb 2025
LANTERN++: Enhancing Relaxed Speculative Decoding with Static Tree Drafting for Visual Auto-regressive Models
Sihwan Park
Doohyuk Jang
Sungyub Kim
Souvik Kundu
Eunho Yang
123
0
0
10 Feb 2025
RideKE: Leveraging Low-Resource, User-Generated Twitter Content for Sentiment and Emotion Detection in Kenyan Code-Switched Dataset
Naome A. Etori
Maria Gini
152
3
0
10 Feb 2025
SMAB: MAB based word Sensitivity Estimation Framework and its Applications in Adversarial Text Generation
Saurabh Kumar Pandey
S. Vashistha
Debrup Das
Somak Aditya
Monojit Choudhury
AAML
134
0
0
10 Feb 2025
CORRECT: Context- and Reference-Augmented Reasoning and Prompting for Fact-Checking
Delvin Ce Zhang
Dongwon Lee
HILM
LRM
145
0
0
09 Feb 2025
Uni-Retrieval: A Multi-Style Retrieval Framework for STEM's Education
Yanhao Jia
Xinyi Wu
Hao Li
Qinglin Zhang
Yuxiao Hu
Shuai Zhao
Wenqi Fan
152
4
0
09 Feb 2025
Reinforced Lifelong Editing for Language Models
Zherui Li
Houcheng Jiang
Hao Chen
Baolong Bi
Zhenhong Zhou
Fei Sun
Sihang Li
Xinze Wang
KELM
111
7
0
09 Feb 2025
Few-shot LLM Synthetic Data with Distribution Matching
Jiyuan Ren
Zhaocheng Du
Zhihao Wen
Qinglin Jia
Sunhao Dai
Chuhan Wu
Zhenhua Dong
SyDa
165
0
0
09 Feb 2025
MoFM: A Large-Scale Human Motion Foundation Model
Mohammadreza Baharani
Ghazal Alinezhad Noghre
Armin Danesh Pazho
Gabriel Maldonado
Hamed Tabkhi
AI4CE
435
1
0
08 Feb 2025