Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2501.12948
Cited By
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
22 January 2025
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
Ruoyu Zhang
Ran Xu
Qihao Zhu
Shirong Ma
P. Wang
Xiao Bi
Xiaokang Zhang
X. Yu
Yu-Huan Wu
Z. F. Wu
Zhibin Gou
Z. Shao
Zhuoshu Li
Zijian Gao
Aixin Liu
Bing Xue
Bingxuan Wang
Bochao Wu
B. Feng
Chengda Lu
Chenggang Zhao
Chengqi Deng
Chenyi Zhang
Chong Ruan
Damai Dai
Deli Chen
Dongjie Ji
Erhang Li
F. Lin
Fucong Dai
Fuli Luo
Guangbo Hao
Guanting Chen
Guozhang Li
Han Zhang
Han Bao
Hanwei Xu
Hairu Wang
Honghui Ding
Huajian Xin
Huazuo Gao
Hui Qu
Hui Li
Jianzhong Guo
Jiashi Li
Jiawei Wang
Jianfei Chen
Jingyang Yuan
Junjie Qiu
Jianxin Li
Jianfeng Cai
Jiaqi Ni
Jian Liang
Jin Chen
Kai Dong
Kai Hu
Kaige Gao
Kang Guan
Kexin Huang
Kuai Yu
Lean Wang
Lecong Zhang
Liang Zhao
L. Wang
Liyue Zhang
Lei Xu
Leyi Xia
Mingchuan Zhang
Minghua Zhang
Minghui Tang
Meng Li
Miaojun Wang
Mingming Li
Ning Tian
Panpan Huang
Peng Zhang
Qian Wang
Qinyu Chen
Qiushi Du
Ruiqi Ge
Ruisong Zhang
Ruizhe Pan
R. Wang
Ruoxin Chen
Rong Jin
Ruyi Chen
Shanghao Lu
Shangyan Zhou
Tian Jin
Shengfeng Ye
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning"
50 / 830 papers shown
Title
Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models
Nicolas Baumann
Cheng Hu
Paviththiren Sivasothilingam
Haotong Qin
Lei Xie
Michele Magno
Luca Benini
37
1
0
15 Apr 2025
Mathematical Capabilities of Large Language Models in Finnish Matriculation Examination
Mika Setälä
Pieta Sikström
Ville Heilala
T. Karkkainen
ELM
LRM
38
1
0
15 Apr 2025
Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning
Haiming Wang
Mert Unsal
Xiaohan Lin
Mantas Baksys
Jiaheng Liu
...
Zhouliang Yu
Zhendong Wang
Zhilin Yang
Zhengying Liu
Jia-Nan Li
AIMat
ReLM
AI4TS
LRM
64
8
0
15 Apr 2025
Xpose: Bi-directional Engineering for Hidden Query Extraction
Ahana Pradhan
Jayant Haritsa
24
0
0
15 Apr 2025
Training Small Reasoning LLMs with Cognitive Preference Alignment
Wenrui Cai
Chengyu Wang
Junbing Yan
Jun Huang
Xiangzhong Fang
LRM
29
1
0
14 Apr 2025
PestMA: LLM-based Multi-Agent System for Informed Pest Management
Hongrui Shi
Shunbao Li
Zhipeng Yuan
Po Yang
LLMAG
51
0
0
14 Apr 2025
Can LLMs Classify CVEs? Investigating LLMs Capabilities in Computing CVSS Vectors
Francesco Marchiori
Denis Donadel
Mauro Conti
31
0
0
14 Apr 2025
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Junxiong Wang
Wen-Ding Li
Daniele Paliotta
Daniel Ritter
Alexander M. Rush
Tri Dao
LRM
40
0
0
14 Apr 2025
MIEB: Massive Image Embedding Benchmark
Chenghao Xiao
Isaac Chung
Imene Kerboua
Jamie Stirling
Xin Zhang
Márton Kardos
Roman Solomatin
Noura Al Moubayed
Kenneth C. Enevoldsen
Niklas Muennighoff
VLM
42
0
0
14 Apr 2025
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
Yang Shi
Jiaheng Liu
Yushuo Guan
Zhikai Wu
Yuyao Zhang
...
Bohan Zeng
Wei Zhang
Fuzheng Zhang
Wenjing Yang
Di Zhang
VGen
VLM
76
0
0
14 Apr 2025
RealSafe-R1: Safety-Aligned DeepSeek-R1 without Compromising Reasoning Capability
Yuanhang Zhang
Zihao Zeng
Dongbai Li
Yao Huang
Zhijie Deng
Yinpeng Dong
LRM
45
5
0
14 Apr 2025
SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users
Xuzhi Zhang
Jiayu Lin
Xinyi Mou
Shiyue Yang
Xiawei Liu
...
Jiebo Luo
Shiping Tang
Libo Wu
Baohua Zhou
Zhongyu Wei
LLMAG
54
3
0
14 Apr 2025
GUI-R1 : A Generalist R1-Style Vision-Language Action Model For GUI Agents
Run Luo
Lu Wang
Wanwei He
Xiaobo Xia
LLMAG
56
12
0
14 Apr 2025
Beyond Chains of Thought: Benchmarking Latent-Space Reasoning Abilities in Large Language Models
Thilo Hagendorff
Sarah Fabi
ReLM
ELM
LRM
50
0
0
14 Apr 2025
EMAFusion: A Self-Optimizing System for Seamless LLM Selection and Integration
Soham Shah
Kumar Shridhar
Surojit Chatterjee
Souvik Sen
42
0
0
14 Apr 2025
Two Heads are Better Than One: Test-time Scaling of Multi-agent Collaborative Reasoning
Can Jin
Hongwu Peng
Qixin Zhang
Yujin Tang
Dimitris Metaxas
Tong Che
LLMAG
LRM
241
3
0
14 Apr 2025
Heimdall: test-time scaling on the generative verification
Wenlei Shi
Xing Jin
LRM
35
2
0
14 Apr 2025
(How) Do reasoning models reason?
S. Kambhampati
Kaya Stechly
Karthik Valmeekam
ReLM
ELM
LRM
AI4CE
69
1
0
14 Apr 2025
CameraBench: Benchmarking Visual Reasoning in MLLMs via Photography
I-Sheng Fang
Jun-Cheng Chen
LRM
VLM
32
0
0
14 Apr 2025
Reasoning without Regret
Tarun Chitra
OffRL
LRM
45
0
0
14 Apr 2025
Understanding and Optimizing Multi-Stage AI Inference Pipelines
Abhimanyu Bambhaniya
Hanjiang Wu
Suvinay Subramanian
Sudarshan Srinivasan
Souvik Kundu
Amir Yazdanbakhsh
Suvinay Subramanian
Madhu Kumar
Tushar Krishna
237
0
0
14 Apr 2025
Weight Ensembling Improves Reasoning in Language Models
Xingyu Dang
Christina Baek
Kaiyue Wen
Zico Kolter
Aditi Raghunathan
MoMe
LRM
65
1
0
14 Apr 2025
TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning
Xingjian Zhang
Siwei Wen
Wenjun Wu
Lei Huang
LRM
45
3
0
13 Apr 2025
ClinicalGPT-R1: Pushing reasoning capability of generalist disease diagnosis with large language model
Wuyang Lan
Wenzheng Wang
Changwei Ji
Guoxing Yang
Y. Zhang
Xiaohong Liu
Song Wu
Guangyu Wang
LM&MA
ELM
LRM
AI4MH
61
0
0
13 Apr 2025
DUMP: Automated Distribution-Level Curriculum Learning for RL-based LLM Post-training
Zhenting Wang
Guofeng Cui
Kun Wan
Wentian Zhao
40
1
0
13 Apr 2025
Leveraging Reasoning Model Answers to Enhance Non-Reasoning Model Capability
Haotian Wang
Han Zhao
Shuaiting Chen
Xiaoyu Tian
Sitong Zhao
Yunjie Ji
Yiping Peng
Xiangang Li
ReLM
LRM
59
0
0
13 Apr 2025
Speculative Thinking: Enhancing Small-Model Reasoning with Large Model Guidance at Inference Time
Wang Yang
Xiang Yue
V. Chaudhary
Xiaotian Han
ReLM
LRM
78
3
0
12 Apr 2025
PathVLM-R1: A Reinforcement Learning-Driven Reasoning Model for Pathology Visual-Language Tasks
Jian Wu
Hao Yang
Xinhua Zeng
Guibing He
Zhengzhang Chen
Zhu Li
Xinming Zhang
Yangyang Ma
Run Fang
Yang Liu
LRM
187
0
0
12 Apr 2025
A Short Survey on Small Reasoning Models: Training, Inference, Applications and Research Directions
Chengyu Wang
Taolin Zhang
Richang Hong
Jun Huang
ReLM
LRM
50
1
0
12 Apr 2025
SortBench: Benchmarking LLMs based on their ability to sort lists
Steffen Herbold
RALM
LRM
50
0
0
11 Apr 2025
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation
Tianwei Xiong
Jun Hao Liew
Zilong Huang
Jiashi Feng
Xihui Liu
41
0
0
11 Apr 2025
ML For Hardware Design Interpretability: Challenges and Opportunities
Raymond Baartmans
Andrew Ensinger
Victor Agostinelli
Lizhong Chen
42
0
0
11 Apr 2025
Playpen: An Environment for Exploring Learning Through Conversational Interaction
Nicola Horst
Davide Mazzaccara
Antonia Schmidt
Michael Sullivan
Filippo Momentè
...
Alexander Koller
Oliver Lemon
David Schlangen
Mario Giulianelli
Alessandro Suglia
OffRL
44
0
0
11 Apr 2025
Large Language Models as Span Annotators
Zdeněk Kasner
Vilém Zouhar
Patrícia Schmidtová
Ivan Kartáč
Kristýna Onderková
Ondřej Plátek
Dimitra Gkatzia
Saad Mahamood
Ondrej Dusek
Simone Balloccu
ALM
47
0
0
11 Apr 2025
The KL3M Data Project: Copyright-Clean Training Resources for Large Language Models
Michael J Bommarito II
Jillian Bommarito
Daniel Martin Katz
AILaw
63
0
0
10 Apr 2025
DeepGreen: Effective LLM-Driven Green-washing Monitoring System Designed for Empirical Testing -- Evidence from China
Congluo Xu
Yu Miao
Yiling Xiao
Chengmengjia Lin
31
0
0
10 Apr 2025
Kimi-VL Technical Report
Kimi Team
Angang Du
B. Yin
Bowei Xing
Bowen Qu
...
Zhiqi Huang
Zihao Huang
Zijia Zhao
Zhengzhang Chen
Zongyu Lin
MLLM
VLM
MoE
219
5
0
10 Apr 2025
Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining
Rosie Zhao
Alexandru Meterez
Sham Kakade
Cengiz Pehlevan
Samy Jelassi
Eran Malach
ReLM
LRM
201
2
0
10 Apr 2025
Model Utility Law: Evaluating LLMs beyond Performance through Mechanism Interpretable Metric
Yixin Cao
Jiahao Ying
Yansen Wang
Xipeng Qiu
Xuanjing Huang
Yugang Jiang
ELM
46
2
0
10 Apr 2025
SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
Hardy Chen
Haoqin Tu
Fali Wang
Hui Liu
Xianfeng Tang
Xinya Du
Yuyin Zhou
Cihang Xie
ReLM
VLM
OffRL
LRM
77
9
0
10 Apr 2025
AI-Slop to AI-Polish? Aligning Language Models through Edit-Based Writing Rewards and Test-time Computation
Tuhin Chakrabarty
Philippe Laban
C. Wu
39
1
0
10 Apr 2025
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
Haozhe Wang
Chao Qu
Zuming Huang
Xiaoyu Tan
Fangzhen Lin
Wenhu Chen
OffRL
ReLM
SyDa
LRM
VLM
83
2
0
10 Apr 2025
VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning
Yukun Qi
Yiming Zhao
Y. Zeng
Xikun Bao
Wenjie Huang
Lin Yen-Chen
Zehui Chen
Jie Zhao
Zhongang Qi
Feng Zhao
LRM
52
1
0
10 Apr 2025
Enhancing Player Enjoyment with a Two-Tier DRL and LLM-Based Agent System for Fighting Games
Shouren Wang
Zehua Jiang
Fernando Sliva
Sam Earle
Julian Togelius
28
0
0
10 Apr 2025
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement
Xinze Wang
Zheng Yang
Chao Feng
Hongjin Lu
Linjie Li
Chung-Ching Lin
Kevin Qinghong Lin
Furong Huang
Lijuan Wang
OODD
ReLM
VLM
LRM
74
3
0
10 Apr 2025
Automating quantum feature map design via large language models
Kenya Sakka
K. Mitarai
Keisuke Fujii
38
2
0
10 Apr 2025
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model
Haozhan Shen
Peng Liu
Jianxin Li
Chunxin Fang
Yibo Ma
...
Zilun Zhang
Kangjia Zhao
Qianqian Zhang
Ruochen Xu
Tiancheng Zhao
VLM
LRM
76
36
0
10 Apr 2025
A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility
Andreas Hochlehnert
Hardik Bhatnagar
Vishaal Udandarao
Samuel Albanie
Ameya Prabhu
Matthias Bethge
ReLM
ALM
LRM
100
9
0
09 Apr 2025
MovSAM: A Single-image Moving Object Segmentation Framework Based on Deep Thinking
Chang Nie
Yiqing Xu
Guangming Wang
Zhe Liu
Yanzi Miao
Hesheng Wang
VLM
43
0
0
09 Apr 2025
Holistic Capability Preservation: Towards Compact Yet Comprehensive Reasoning Models
Ling Team
Caizhi Tang
Chilin Fu
Chunwei Wu
Jia Guo
...
Shuaicheng Li
Wenjie Qu
Yingting Wu
Y. Liu
Zhenyu Huang
LRM
38
0
0
09 Apr 2025
Previous
1
2
3
...
7
8
9
...
15
16
17
Next