2501.12948
Cited By
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
22 January 2025
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
Ruoyu Zhang
Ran Xu
Qihao Zhu
Shirong Ma
P. Wang
Xiao Bi
Yanling Wang
X. Yu
Yu-Huan Wu
Z. F. Wu
Zhibin Gou
Z. Shao
Zhuoshu Li
Zijian Gao
Aixin Liu
Bing Xue
Bingxuan Wang
Bochao Wu
B. Feng
Chengda Lu
Chenggang Zhao
Chengqi Deng
Chenyi Zhang
Chong Ruan
Damai Dai
Deli Chen
Dongjie Ji
Erhang Li
F. Lin
Fucong Dai
Fuli Luo
Guangbo Hao
Guanting Chen
Guozhang Li
Han Zhang
Han Bao
Hanwei Xu
Han Wang
Honghui Ding
Huajian Xin
Huazuo Gao
Hui Qu
Hui Li
Jianzhong Guo
Jiashi Li
Jiawei Wang
Jianfei Chen
Jingyang Yuan
Junjie Qiu
Junlong Li
Jianfeng Cai
Jiaqi Ni
Jian Liang
Jin Chen
Kai Dong
Kai Hu
Kaige Gao
Kang Guan
Kexin Huang
Kuai Yu
Lean Wang
Lecong Zhang
Liang Zhao
L. Wang
Liyue Zhang
Lei Xu
Leyi Xia
Mingchuan Zhang
Minghua Zhang
Minghui Tang
Meng Li
Miaojun Wang
Mingming Li
Ning Tian
Panpan Huang
Peng Zhang
Qian Wang
Qinyu Chen
Qiushi Du
Ruiqi Ge
Ruisong Zhang
Ruizhe Pan
Rongpin Wang
Ruoxin Chen
Rong Jin
Ruyi Chen
Shanghao Lu
Shangyan Zhou
Tian Jin
Shengfeng Ye
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
Papers citing
"DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning"
50 / 1,327 papers shown
Title
EvolvTrip: Enhancing Literary Character Understanding with Temporal Theory-of-Mind Graphs
Bohao Yang
Hainiu Xu
Jinhua Du
Ze Li
Yulan He
Chenghua Lin
47
0
0
16 Jun 2025
FinLMM-R1: Enhancing Financial Reasoning in LMM through Scalable Data and Reward Design
Kai Lan
Jiayong Zhu
Jiangtong Li
Dawei Cheng
Guang-Sheng Chen
Changjun Jiang
LRM
36
0
0
16 Jun 2025
ExtendAttack: Attacking Servers of LRMs via Extending Reasoning
Zhenhao Zhu
Yue Liu
Yingwei Ma
Hongcheng Gao
Nuo Chen
Yanpei Guo
Wenjie Qu
Huiying Xu
Xinzhong Zhu
Jiaheng Zhang
AAML
LRM
40
0
0
16 Jun 2025
Multipole Attention for Efficient Long Context Reasoning
Coleman Hooper
Sebastian Zhao
Luca Manolache
Sehoon Kim
Michael W. Mahoney
Y. Shao
Kurt Keutzer
Amir Gholami
OffRL
LRM
35
0
0
16 Jun 2025
Discrete Diffusion in Large Language and Multimodal Models: A Survey
Runpeng Yu
Qi Li
Xinchao Wang
DiffM
AI4CE
61
0
0
16 Jun 2025
Position: Pause Recycling LoRAs and Prioritize Mechanisms to Uncover Limits and Effectiveness
Mei-Yen Chen
Thi Thu Uyen Hoang
Michael Hahn
M. Sarfraz
MoMe
35
0
0
16 Jun 2025
xbench: Tracking Agents Productivity Scaling with Profession-Aligned Real-World Evaluations
Kaiyuan Chen
Y. Ren
Yang Liu
Xiaobo Hu
Haotong Tian
...
Yuan Jiang
Zexuan Liu
Zihan Yin
Zijian Ma
Zhiwen Mo
53
0
0
16 Jun 2025
Document-Level Tabular Numerical Cross-Checking: A Coarse-to-Fine Approach
Chaoxu Pang
Yixuan Cao
Ganbin Zhou
Hongwei Bran Li
Ping Luo
LMTD
52
0
0
16 Jun 2025
Cross-architecture universal feature coding via distribution alignment
Changsheng Gao
Shan Liu
Feng Wu
Weisi Lin
OOD
9
0
0
15 Jun 2025
SPECS: Faster Test-Time Scaling through Speculative Drafts
Mert Cemri
Nived Rajaraman
Rishabh Tiwari
Xiaoxuan Liu
Kurt Keutzer
Ion Stoica
Kannan Ramchandran
Ahmad Beirami
Ziteng Sun
LRM
29
0
0
15 Jun 2025
Reasoning Model Unlearning: Forgetting Traces, Not Just Answers, While Preserving Reasoning Skills
Changsheng Wang
Chongyu Fan
Yihua Zhang
Jinghan Jia
Dennis Wei
Parikshit Ram
Nathalie Baracaldo
Sijia Liu
MU
KELM
LRM
65
0
0
15 Jun 2025
QiMeng-Attention: SOTA Attention Operator is generated by SOTA Attention Algorithm
Qirui Zhou
Shaohui Peng
Weiqiang Xiong
Haixin Chen
Yuanbo Wen
...
Ke Gao
Ruizhi Chen
Yanjun Wu
Chen Zhao
Y. Chen
LRM
37
0
0
14 Jun 2025
Bridging the Digital Divide: Small Language Models as a Pathway for Physics and Photonics Education in Underdeveloped Regions
Asghar Ghorbani
Hanieh Fattahi
46
0
0
14 Jun 2025
Performance Plateaus in Inference-Time Scaling for Text-to-Image Diffusion Without External Models
Changhyun Choi
S. Kim
H. Jin Kim
DiffM
28
0
0
14 Jun 2025
Efficient Reasoning Through Suppression of Self-Affirmation Reflections in Large Reasoning Models
Kaiyuan Liu
Chen Shen
Zhanwei Zhang
Junjie Liu
Xiaosong Yuan
Jieping Ye
ReLM
LRM
57
0
0
14 Jun 2025
Advances in LLMs with Focus on Reasoning, Adaptability, Efficiency and Ethics
Asifullah Khan
Muhammad Zaeem Khan
Saleha Jamshed
Sadia Ahmad
Aleesha Zainab
Kaynat Khatib
Faria Bibi
Abdul Rehman
OffRL
LRM
42
0
0
14 Jun 2025
Similarity as Reward Alignment: Robust and Versatile Preference-based Reinforcement Learning
Sara Rajaram
R. J. Cotton
Fabian H. Sinz
29
0
0
14 Jun 2025
VFaith: Do Large Multimodal Models Really Reason on Seen Images Rather than Previous Memories?
Jiachen Yu
Yufei Zhan
Ziheng Wu
Yousong Zhu
Jinqiao Wang
Minghui Qiu
VLM
LRM
36
0
0
13 Jun 2025
Prioritizing Alignment Paradigms over Task-Specific Model Customization in Time-Series LLMs
Wei Li
Yunyao Cheng
Xinli Hao
Chaohong Ma
Yuxuan Liang
Bin Yang
Christian S. Jensen
Xiaofeng Meng
AI4TS
47
0
0
13 Jun 2025
Bias Amplification in RAG: Poisoning Knowledge Retrieval to Steer LLMs
Linlin Wang
Tianqing Zhu
Laiqiao Qin
Longxiang Gao
Wanlei Zhou
31
0
0
13 Jun 2025
TongSearch-QR: Reinforced Query Reasoning for Retrieval
Xubo Qin
Jun Bai
Jiaqi Li
Zixia Jia
Zilong Zheng
ReLM
RALM
LRM
61
0
0
13 Jun 2025
RAG+: Enhancing Retrieval-Augmented Generation with Application-Aware Reasoning
Yu Wang
Shiwan Zhao
Ming Fan
Zhihu Wang
Y. Zhang
Xicheng Zhang
Zhengfan Wang
Heyuan Huang
Ting Liu
VLM
LRM
45
0
0
13 Jun 2025
Schema-R1: A reasoning training approach for schema linking in Text-to-SQL Task
Wuzhenghong Wen
Su Pan
Yuwei Sun
ReLM
LRM
78
0
0
13 Jun 2025
From Emergence to Control: Probing and Modulating Self-Reflection in Language Models
Xudong Zhu
Jiachen Jiang
Mohammad Mahdi Khalili
Zhihui Zhu
ReLM
LM&Ro
LRM
65
0
0
13 Jun 2025
LogiPlan: A Structured Benchmark for Logical Planning and Relational Reasoning in LLMs
Yanan Cai
Ahmed Salem
Besmira Nushi
M. Russinovich
LLMAG
LRM
136
0
0
12 Jun 2025
VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos
Jiashuo Yu
Y. Wu
Meng Chu
Zhifei Ren
Z. Huang
...
Conghui He
Yu Qiao
Yali Wang
Yi Wang
L. Wang
LRM
140
0
0
12 Jun 2025
Self-Adapting Language Models
Adam Zweiger
Jyothish Pari
Han Guo
Ekin Akyürek
Yoon Kim
Pulkit Agrawal
KELM
LRM
155
0
0
12 Jun 2025
WGSR-Bench: Wargame-based Game-theoretic Strategic Reasoning Benchmark for Large Language Models
Qiyue Yin
Pei Xu
Qiaozhe Li
Shengda Liu
S. Shen
...
Lei Cui
Chengxin Yan
Jie Sun
Xiangquan Tang
K. Huang
LLMAG
ELM
LRM
124
0
0
12 Jun 2025
Time Series Forecasting as Reasoning: A Slow-Thinking Approach with Reinforced LLMs
Yucong Luo
Yitong Zhou
Mingyue Cheng
Jiahao Wang
Daoyu Wang
Tingyue Pan
Jintao Zhang
AI4TS
LRM
129
0
0
12 Jun 2025
VideoDeepResearch: Long Video Understanding With Agentic Tool Using
Huaying Yuan
Zheng Liu
Junjie Zhou
Ji-Rong Wen
Zhicheng Dou
VLM
139
0
0
12 Jun 2025
Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Principles
Qingyan Wei
Y. Zhang
Zhiyuan Liu
Dongrui Liu
Linfeng Zhang
DiffM
AI4CE
159
0
0
12 Jun 2025
Discovering Hierarchical Latent Capabilities of Language Models via Causal Representation Learning
Jikai Jin
Vasilis Syrgkanis
Sham Kakade
Hanlin Zhang
ELM
142
1
0
12 Jun 2025
Poutine: Vision-Language-Trajectory Pre-Training and Reinforcement Learning Post-Training Enable Robust End-to-End Autonomous Driving
Luke Rowe
Rodrigue de Schaetzen
Roger Girgis
C. Pal
Liam Paull
MLLM
VLM
36
0
0
12 Jun 2025
OPT-BENCH: Evaluating LLM Agent on Large-Scale Search Spaces Optimization Problems
Xiaozhe Li
Jixuan Chen
Xinyu Fang
Shengyuan Ding
Haodong Duan
Qingwen Liu
Kai-xiang Chen
LLMAG
LRM
120
0
0
12 Jun 2025
TeleMath: A Benchmark for Large Language Models in Telecom Mathematical Problem Solving
Vincenzo Colle
Mohamed Sana
Nicola Piovesan
A. De Domenico
Fadhel Ayed
Merouane Debbah
85
0
0
12 Jun 2025
Provably Learning from Language Feedback
Wanqiao Xu
Allen Nie
Ruijie Zheng
Aditya Modi
Adith Swaminathan
Ching-An Cheng
166
0
0
12 Jun 2025
Reasoning RAG via System 1 or System 2: A Survey on Reasoning Agentic Retrieval-Augmented Generation for Industry Challenges
Jintao Liang
Gang Su
Huifeng Lin
You Wu
Rui Zhao
Ziyue Li
3DV
LRM
143
0
0
12 Jun 2025
OIBench: Benchmarking Strong Reasoning Models with Olympiad in Informatics
Yaoming Zhu
Junxin Wang
Yiyang Li
Lin Qiu
Zongyu Wang
...
Xuezhi Cao
Yuhuai Wei
Mingshi Wang
Xunliang Cai
Rong Ma
LRM
131
0
0
12 Jun 2025
PAG: Multi-Turn Reinforced LLM Self-Correction with Policy as Generative Verifier
Y. Jiang
Yuwen Xiong
Yufeng Yuan
Chao Xin
Wenyuan Xu
Yu Yue
Qianchuan Zhao
Lin Yan
LRM
135
0
0
12 Jun 2025
LLM Embedding-based Attribution (LEA): Quantifying Source Contributions to Generative Model's Response for Vulnerability Analysis
Reza Fayyazi
Michael Zuzak
S. Yang
39
0
0
12 Jun 2025
AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation
Zijie Wu
Chaohui Yu
Fan Wang
Xiang Bai
AI4CE
65
0
0
11 Jun 2025
ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning
Yu Sun
Xingyu Qian
Weiwen Xu
Hao Zhang
Chenghao Xiao
Long Li
Yu Rong
Wenbing Huang
Qifeng Bai
Tingyang Xu
LRM
79
0
0
11 Jun 2025
ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs
Xiyao Wang
Zhengyuan Yang
Chao Feng
Yongyuan Liang
Yuhang Zhou
...
Chung-Ching Lin
Kevin Lin
Linjie Li
Furong Huang
L. Wang
OffRL
LRM
73
0
0
11 Jun 2025
Athena: Enhancing Multimodal Reasoning with Data-efficient Process Reward Models
Shuai Wang
Zhenhua Liu
Jiaheng Wei
Xuanwu Yin
Dong Li
E. Barsoum
LRM
92
0
0
11 Jun 2025
CoRT: Code-integrated Reasoning within Thinking
Chengpeng Li
Zhengyang Tang
Ziniu Li
Mingfeng Xue
Keqin Bao
...
Ruoyu Sun
Benyou Wang
Xiang Wang
Junyang Lin
Dayiheng Liu
LLMAG
OffRL
ReLM
LRM
85
0
0
11 Jun 2025
ComfyUI-R1: Exploring Reasoning Models for Workflow Generation
Zhenran Xu
Yiyu Wang
Xue Yang
Longyue Wang
Weihua Luo
Kaifu Zhang
Baotian Hu
Min Zhang
AI4TS
LRM
85
0
0
11 Jun 2025
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
Xinyu Yang
Yuwei An
Hongyi Liu
Tianqi Chen
Beidi Chen
SyDa
LRM
189
0
0
11 Jun 2025
3D-RAD: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks
Xiaotang Gai
Jiaxiang Liu
Yichen Li
Zijie Meng
Jian Wu
Zuozhu Liu
VGen
27
0
0
11 Jun 2025
Vision Matters: Simple Visual Perturbations Can Boost Multimodal Math Reasoning
Yuting Li
Lai Wei
Kaipeng Zheng
Jingyuan Huang
Linghe Kong
Lichao Sun
Weiran Huang
AAML
LRM
VLM
89
0
0
11 Jun 2025
Mitigating Spurious Correlations in LLMs via Causality-Aware Post-Training
Shurui Gui
Shuiwang Ji
LRM
83
0
0
11 Jun 2025