Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2501.12948
Cited By
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
22 January 2025
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
Ruoyu Zhang
Ran Xu
Qihao Zhu
Shirong Ma
P. Wang
Xiao Bi
Xiaokang Zhang
X. Yu
Yu-Huan Wu
Z. F. Wu
Zhibin Gou
Z. Shao
Zhuoshu Li
Zijian Gao
Aixin Liu
Bing Xue
Bingxuan Wang
Bochao Wu
B. Feng
Chengda Lu
Chenggang Zhao
Chengqi Deng
Chenyi Zhang
Chong Ruan
Damai Dai
Deli Chen
Dongjie Ji
Erhang Li
F. Lin
Fucong Dai
Fuli Luo
Guangbo Hao
Guanting Chen
Guozhang Li
Han Zhang
Han Bao
Hanwei Xu
Han Wang
Honghui Ding
Huajian Xin
Huazuo Gao
Hui Qu
Hui Li
Jianzhong Guo
Jiashi Li
Jiawei Wang
Jianfei Chen
Jingyang Yuan
Junjie Qiu
Junlong Li
Jianfeng Cai
Jiaqi Ni
Jian Liang
Jin Chen
Kai Dong
Kai Hu
Kaige Gao
Kang Guan
Kexin Huang
Kuai Yu
Lean Wang
Lecong Zhang
Liang Zhao
L. Wang
Liyue Zhang
Lei Xu
Leyi Xia
Mingchuan Zhang
Minghua Zhang
Minghui Tang
Meng Li
Miaojun Wang
Mingming Li
Ning Tian
Panpan Huang
Peng Zhang
Qian Wang
Qinyu Chen
Qiushi Du
Ruiqi Ge
Ruisong Zhang
Ruizhe Pan
R. Wang
Ruoxin Chen
Rong Jin
Ruyi Chen
Shanghao Lu
Shangyan Zhou
Tian Jin
Shengfeng Ye
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning"
50 / 830 papers shown
Title
Statics of continuum planar grasping
Udit Halder
36
0
0
03 Apr 2025
Noiser: Bounded Input Perturbations for Attributing Large Language Models
Mohammad Reza Ghasemi Madani
Aryo Pradipta Gema
Gabriele Sarti
Yu Zhao
Pasquale Minervini
Andrea Passerini
AAML
40
0
0
03 Apr 2025
F5R-TTS: Improving Flow-Matching based Text-to-Speech with Group Relative Policy Optimization
Xiaohui Sun
Ruitong Xiao
Jianye Mo
Bowen Wu
Qun Yu
Baoxun Wang
53
1
0
03 Apr 2025
MegaMath: Pushing the Limits of Open Math Corpora
Fan Zhou
Zengzhi Wang
Nikhil Ranjan
Zhoujun Cheng
Liping Tang
Guowei He
Zhengzhong Liu
Eric P. Xing
LRM
56
1
0
03 Apr 2025
Beyond Accuracy: The Role of Calibration in Self-Improving Large Language Models
Liangjie Huang
Dawei Li
Huan Liu
Lu Cheng
LRM
44
0
0
03 Apr 2025
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving
Daoguang Zan
Zhirong Huang
Wei Liu
Hanwu Chen
L. Zhang
...
Jing Su
Tianyu Liu
Rui Long
Kai Shen
Liang Xiang
56
2
0
03 Apr 2025
Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision
Xiaofeng Han
Shunpeng Chen
Zenghuang Fu
Zhe Feng
Lue Fan
...
Li Guo
Weiliang Meng
Xiaopeng Zhang
Rongtao Xu
Shibiao Xu
74
1
0
03 Apr 2025
Affordable AI Assistants with Knowledge Graph of Thoughts
Maciej Besta
Lorenzo Paleari
Jia Hao Andrea Jiang
Robert Gerstenberger
You Wu
...
Jón Gunnar Hannesson
Grzegorz Kwa'sniewski
Marcin Copik
H. Niewiadomski
Torsten Hoefler
LLMAG
RALM
254
0
0
03 Apr 2025
AnesBench: Multi-Dimensional Evaluation of LLM Reasoning in Anesthesiology
Xiang Feng
Wentao Jiang
Zengmao Wang
Yong Luo
Pingbo Xu
Baosheng Yu
Hua Jin
Bo Du
Jing Zhang
ELM
LRM
48
0
0
03 Apr 2025
Understanding Aha Moments: from External Observations to Internal Mechanisms
Shu Yang
Junchao Wu
Xin Chen
Yunze Xiao
Xinyi Yang
Derek F. Wong
Di Wang
LRM
38
2
0
03 Apr 2025
Inference-Time Scaling for Generalist Reward Modeling
Zijun Liu
P. Wang
Ran Xu
Shirong Ma
Chong Ruan
Ziwei Sun
Yang Liu
Y. Wu
OffRL
LRM
56
21
0
03 Apr 2025
How Deep Do Large Language Models Internalize Scientific Literature and Citation Practices?
Andres Algaba
Vincent Holst
Floriano Tori
Melika Mobini
Brecht Verbeken
Sylvia Wenmackers
Vincent Ginis
54
1
0
03 Apr 2025
Do Theory of Mind Benchmarks Need Explicit Human-like Reasoning in Language Models?
Yi-Long Lu
Chunhui Zhang
Jiajun Song
Lifeng Fan
Wei Wang
OffRL
53
0
0
02 Apr 2025
Watermarking for AI Content Detection: A Review on Text, Visual, and Audio Modalities
Lele Cao
46
0
0
02 Apr 2025
Chain of Correction for Full-text Speech Recognition with Large Language Models
Zhiyuan Tang
Dong Wang
Zhikai Zhou
Y. Liu
Shen Huang
Shri Kiran Srinivasan
KELM
61
0
0
02 Apr 2025
RoboAct-CLIP: Video-Driven Pre-training of Atomic Action Understanding for Robotics
Zhiyuan Zhang
Yuxin He
Yong Sun
Junyu Shi
Lijiang Liu
Qiang Nie
VLM
54
0
0
02 Apr 2025
Emerging Cyber Attack Risks of Medical AI Agents
Jianing Qiu
Lin Li
Jiankai Sun
Hao Wei
Zhe Xu
K. Lam
Wu Yuan
AAML
35
2
0
02 Apr 2025
ThinkPrune: Pruning Long Chain-of-Thought of LLMs via Reinforcement Learning
Bairu Hou
Yang Zhang
Jiabao Ji
Yujian Liu
Kaizhi Qian
Jacob Andreas
Shiyu Chang
OffRL
LRM
67
9
0
02 Apr 2025
Reasoning LLMs for User-Aware Multimodal Conversational Agents
Hamed Rahimi
Jeanne Cattoni
Meriem Beghili
Mouad Abrini
Mahdi Khoramshahi
Maribel Pino
Mohamed Chetouani
LRM
36
1
0
02 Apr 2025
Testing Low-Resource Language Support in LLMs Using Language Proficiency Exams: the Case of Luxembourgish
Cedric Lothritz
Jordi Cabot
43
0
0
02 Apr 2025
When Reasoning Meets Compression: Benchmarking Compressed Large Reasoning Models on Complex Reasoning Tasks
Nan Zhang
Yusen Zhang
Prasenjit Mitra
Rui Zhang
MQ
LRM
72
2
0
02 Apr 2025
Scaling Test-time Compute for Low-resource Languages: Multilingual Reasoning in LLMs
Khanh-Tung Tran
Barry O'Sullivan
Hoang D. Nguyen
LRM
42
2
0
02 Apr 2025
Scaling Test-Time Inference with Policy-Optimized, Dynamic Retrieval-Augmented Generation via KV Caching and Decoding
Sakhinana Sagar Srinivas
Akash Das
Shivam Gupta
Venkataramana Runkana
OffRL
58
1
0
02 Apr 2025
GraphMaster: Automated Graph Synthesis via LLM Agents in Data-Limited Environments
Enjun Du
Xunkai Li
Tian Jin
Zhihan Zhang
Rong-Hua Li
Guoren Wang
48
0
0
01 Apr 2025
Z1: Efficient Test-time Scaling with Code
Zhaojian Yu
Yinghao Wu
Yilun Zhao
Arman Cohan
Xiao-Ping Zhang
LRM
37
6
0
01 Apr 2025
Do We Truly Need So Many Samples? Multi-LLM Repeated Sampling Efficiently Scales Test-Time Compute
Jianhao Chen
Zishuo Xun
Bocheng Zhou
Han Qi
Qiaosheng Zhang
...
Wei Hu
Yuzhong Qu
W. Ouyang
Wanli Ouyang
Shuyue Hu
74
1
0
01 Apr 2025
Neural Approaches to SAT Solving: Design Choices and Interpretability
David Mojžíšek
Jan Hůla
Ziwei Li
Ziyu Zhou
Mikoláš Janota
AAML
NAI
41
0
0
01 Apr 2025
Can LLMs Grasp Implicit Cultural Values? Benchmarking LLMs' Metacognitive Cultural Intelligence with CQ-Bench
Ziyi Liu
Priyanka Dey
Zhenyu Zhao
Jen-tse Huang
Rahul Gupta
Yong-Jin Liu
Jieyu Zhao
41
0
0
01 Apr 2025
Short-PHD: Detecting Short LLM-generated Text with Topological Data Analysis After Off-topic Content Insertion
Dongjun Wei
Minjia Mao
Xiao Fang
Michael Chau
DeLMO
53
0
0
01 Apr 2025
GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning
Jian Zhao
Runze Liu
Kaiyan Zhang
Zhimu Zhou
Junqi Gao
...
Jiafei Lyu
Zhouyi Qian
Biqing Qi
Xiu Li
Bowen Zhou
OffRL
LRM
47
6
0
01 Apr 2025
VNJPTranslate: A comprehensive pipeline for Vietnamese-Japanese translation
Hoang Hai Phan
Nguyen Duc Minh Vu
Nam Dang Phuong
48
0
0
01 Apr 2025
Improved Visual-Spatial Reasoning via R1-Zero-Like Training
Zhenyi Liao
Qingsong Xie
Yanhao Zhang
Zijian Kong
Haonan Lu
Zhenyu Yang
Zhijie Deng
ReLM
VLM
LRM
112
1
1
01 Apr 2025
When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning
Nishad Singhi
Hritik Bansal
Arian Hosseini
Aditya Grover
Kai-Wei Chang
Marcus Rohrbach
Anna Rohrbach
OffRL
LRM
50
2
0
01 Apr 2025
Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems?
Kai Yan
Yufei Xu
Zhengyin Du
Xuesong Yao
Ziyi Wang
Xiaowen Guo
Jiecao Chen
ReLM
ELM
LRM
95
4
0
01 Apr 2025
Hawkeye:Efficient Reasoning with Model Collaboration
Jianshu She
Z. Li
Zhemin Huang
Qi Li
Peiran Xu
Haonan Li
Qirong Ho
LRM
60
3
0
01 Apr 2025
MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs
Juncheng Wu
Wenlong Deng
Xiaochen Li
Sheng Liu
Taomian Mi
...
Yihan Cao
Hui Ren
Xuzhao Li
Xiaoxiao Li
Yuyin Zhou
AI4MH
LRM
66
4
0
01 Apr 2025
JudgeLRM: Large Reasoning Models as a Judge
Nuo Chen
Zhiyuan Hu
Qingyun Zou
Jiaying Wu
Qian Wang
Bryan Hooi
Bingsheng He
ReLM
ELM
LRM
75
9
0
31 Mar 2025
Boosting MLLM Reasoning with Text-Debiased Hint-GRPO
Qihan Huang
Long Chan
Jinlong Liu
Wanggui He
Hao Jiang
Mingli Song
Jingyuan Chen
Chang Yao
Jie Song
LRM
37
0
0
31 Mar 2025
Can Test-Time Scaling Improve World Foundation Model?
Wenyan Cong
Hanqing Zhu
Peihao Wang
Bangya Liu
Dejia Xu
Kevin Wang
David Z. Pan
Yan Wang
Zhiwen Fan
Ziyi Wang
41
0
0
31 Mar 2025
Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning
J. Lin
Tian Wang
Kun Qian
LRM
52
4
0
31 Mar 2025
CrowdVLM-R1: Expanding R1 Ability to Vision Language Model for Crowd Counting using Fuzzy Group Relative Policy Reward
Zhiqiang Wang
Pengbin Feng
Yanbin Lin
Shuzhang Cai
Zongao Bian
Jinghua Yan
Xingquan Zhu
36
1
0
31 Mar 2025
Adaptive Layer-skipping in Pre-trained LLMs
Xuan Luo
Weizhi Wang
Xifeng Yan
245
0
0
31 Mar 2025
Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead
Vidhisha Balachandran
Jingya Chen
Lingjiao Chen
Shivam Garg
Neel Joshi
...
John Langford
Besmira Nushi
Vibhav Vineet
Yue Wu
Safoora Yousefi
ReLM
LRM
64
4
0
31 Mar 2025
Large Language Models in Numberland: A Quick Test of Their Numerical Reasoning Abilities
Roussel Rahman
ReLM
ELM
LRM
48
0
0
31 Mar 2025
LLM4FS: Leveraging Large Language Models for Feature Selection and How to Improve It
Jianhao Li
Xianchao Xiu
49
0
0
31 Mar 2025
Model Hemorrhage and the Robustness Limits of Large Language Models
Ziyang Ma
Zhiyu Li
Lefei Zhang
Gui-Song Xia
Bo Du
Liangpei Zhang
Dacheng Tao
62
0
0
31 Mar 2025
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1
Yi Chen
Yuying Ge
Rui Wang
Yixiao Ge
Lu Qiu
Ying Shan
Xihui Liu
ReLM
VLM
OffRL
LRM
69
2
0
31 Mar 2025
HumanAesExpert: Advancing a Multi-Modality Foundation Model for Human Image Aesthetic Assessment
Zhichao Liao
Xiaokun Liu
Wenyu Qin
Qingyu Li
Qiulin Wang
Pengfei Wan
Di Zhang
Long Zeng
Pingfa Feng
62
0
0
31 Mar 2025
SQuat: Subspace-orthogonal KV Cache Quantization
Hao Wang
Ligong Han
Kai Xu
Akash Srivastava
MQ
56
0
0
31 Mar 2025
RARE: Retrieval-Augmented Reasoning Modeling
Zhengren Wang
Jiayang Yu
Dongsheng Ma
Zhengzhang Chen
Yu Wang
...
Zhiyu Li
Yanfeng Wang
Weinan E
Linpeng Tang
Feiyu Xiong
RALM
LRM
47
3
0
30 Mar 2025
Previous
1
2
3
...
9
10
11
...
15
16
17
Next