ResearchTrend.AI
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

22 January 2025
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
Ruoyu Zhang
Ran Xu
Qihao Zhu
Shirong Ma
P. Wang
Xiao Bi
Yanling Wang
X. Yu
Yu-Huan Wu
Z. F. Wu
Zhibin Gou
Z. Shao
Zhuoshu Li
Zijian Gao
Aixin Liu
Bing Xue
Bingxuan Wang
Bochao Wu
B. Feng
Chengda Lu
Chenggang Zhao
Chengqi Deng
Chenyi Zhang
Chong Ruan
Damai Dai
Deli Chen
Dongjie Ji
Erhang Li
F. Lin
Fucong Dai
Fuli Luo
Guangbo Hao
Guanting Chen
Guozhang Li
Han Zhang
Han Bao
Hanwei Xu
Han Wang
Honghui Ding
Huajian Xin
Huazuo Gao
Hui Qu
Hui Li
Jianzhong Guo
Jiashi Li
Jiawei Wang
Jianfei Chen
Jingyang Yuan
Junjie Qiu
Junlong Li
Jianfeng Cai
Jiaqi Ni
Jian Liang
Jin Chen
Kai Dong
Kai Hu
Kaige Gao
Kang Guan
Kexin Huang
Kuai Yu
Lean Wang
Lecong Zhang
Liang Zhao
L. Wang
Liyue Zhang
Lei Xu
Leyi Xia
Mingchuan Zhang
Minghua Zhang
Minghui Tang
Meng Li
Miaojun Wang
Mingming Li
Ning Tian
Panpan Huang
Peng Zhang
Qian Wang
Qinyu Chen
Qiushi Du
Ruiqi Ge
Ruisong Zhang
Ruizhe Pan
Rongpin Wang
Ruoxin Chen
Rong Jin
Ruyi Chen
Shanghao Lu
Shangyan Zhou
Tian Jin
Shengfeng Ye
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
    ReLM, VLM, OffRL, AI4TS, LRM

Papers citing "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning"

50 / 1,327 papers shown
Title
Emerging Cyber Attack Risks of Medical AI Agents
Jianing Qiu
Lin Li
Jiankai Sun
Hao Wei
Zhe Xu
K. Lam
Wu Yuan
AAML
120
3
0
02 Apr 2025
Testing Low-Resource Language Support in LLMs Using Language Proficiency Exams: the Case of Luxembourgish
Cedric Lothritz
Jordi Cabot
83
0
0
02 Apr 2025
Watermarking for AI Content Detection: A Review on Text, Visual, and Audio Modalities
Lele Cao
89
1
0
02 Apr 2025
When Reasoning Meets Compression: Benchmarking Compressed Large Reasoning Models on Complex Reasoning Tasks
Nan Zhang
Yusen Zhang
Prasenjit Mitra
Rui Zhang
MQ, LRM
183
4
0
02 Apr 2025
Do Theory of Mind Benchmarks Need Explicit Human-like Reasoning in Language Models?
Yi-Long Lu
Chunhui Zhang
Jiajun Song
Lifeng Fan
Wei Wang
OffRL
124
0
0
02 Apr 2025
RoboAct-CLIP: Video-Driven Pre-training of Atomic Action Understanding for Robotics
Zhiyuan Zhang
Yuxin He
Yong Sun
Junyu Shi
Lijiang Liu
Qiang Nie
VLM
114
0
0
02 Apr 2025
Chain of Correction for Full-text Speech Recognition with Large Language Models
Zhiyuan Tang
Dong Wang
Zhikai Zhou
Y. Liu
Shen Huang
Siyang Song
KELM
113
0
0
02 Apr 2025
Scaling Test-Time Inference with Policy-Optimized, Dynamic Retrieval-Augmented Generation via KV Caching and Decoding
Sakhinana Sagar Srinivas
Akash Das
Shivam Gupta
Venkataramana Runkana
OffRL
132
1
0
02 Apr 2025
ThinkPrune: Pruning Long Chain-of-Thought of LLMs via Reinforcement Learning
Bairu Hou
Yang Zhang
Jiabao Ji
Yujian Liu
Kaizhi Qian
Jacob Andreas
Shiyu Chang
OffRL, LRM
131
35
0
02 Apr 2025
Scaling Test-time Compute for Low-resource Languages: Multilingual Reasoning in LLMs
Khanh-Tung Tran
Barry O'Sullivan
Hoang D. Nguyen
LRM
131
2
0
02 Apr 2025
Reasoning LLMs for User-Aware Multimodal Conversational Agents
Hamed Rahimi
Jeanne Cattoni
Meriem Beghili
Mouad Abrini
Mahdi Khoramshahi
Maribel Pino
Mohamed Chetouani
LRM
93
2
0
02 Apr 2025
Improved Visual-Spatial Reasoning via R1-Zero-Like Training
Zhenyi Liao
Qingsong Xie
Yanhao Zhang
Zijian Kong
Haonan Lu
Zhenyu Yang
Zhijie Deng
ReLM, VLM, LRM
205
11
1
01 Apr 2025
MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs
Juncheng Wu
Wenlong Deng
Xiaochen Li
Sheng Liu
Taomian Mi
...
Yihan Cao
Hui Ren
Xuzhao Li
Xiaoxiao Li
Yuyin Zhou
AI4MH, LRM
146
16
0
01 Apr 2025
When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning
Nishad Singhi
Hritik Bansal
Arian Hosseini
Aditya Grover
Kai-Wei Chang
Marcus Rohrbach
Anna Rohrbach
OffRL, LRM
140
6
0
01 Apr 2025
VNJPTranslate: A comprehensive pipeline for Vietnamese-Japanese translation
Hoang Hai Phan
Nguyen Duc Minh Vu
Nam Dang Phuong
73
0
0
01 Apr 2025
Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems?
Kai Yan
Yufei Xu
Zhengyin Du
Xuesong Yao
Ziyi Wang
Xiaowen Guo
Jiecao Chen
ReLM, ELM, LRM
222
5
0
01 Apr 2025
Hawkeye: Efficient Reasoning with Model Collaboration
Jianshu She
Z. Li
Zhemin Huang
Qi Li
Peiran Xu
Haonan Li
Qirong Ho
LRM
179
4
0
01 Apr 2025
GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning
Jian Zhao
Runze Liu
Kaiyan Zhang
Zhimu Zhou
Junqi Gao
...
Jiafei Lyu
Zhouyi Qian
Biqing Qi
Xiu Li
Bowen Zhou
OffRL, LRM
137
13
0
01 Apr 2025
Short-PHD: Detecting Short LLM-generated Text with Topological Data Analysis After Off-topic Content Insertion
Dongjun Wei
Minjia Mao
Xiao Fang
Michael Chau
DeLMO
102
1
0
01 Apr 2025
Neural Approaches to SAT Solving: Design Choices and Interpretability
David Mojžíšek
Jan Hůla
Ziwei Li
Ziyu Zhou
Mikoláš Janota
AAML, NAI
73
0
0
01 Apr 2025
Z1: Efficient Test-time Scaling with Code
Zhaojian Yu
Yinghao Wu
Yilun Zhao
Arman Cohan
Xiao-Ping Zhang
LRM
125
14
0
01 Apr 2025
GraphMaster: Automated Graph Synthesis via LLM Agents in Data-Limited Environments
Enjun Du
Miao Hu
Tian Jin
Zhihan Zhang
Rong-Hua Li
Guoren Wang
131
4
0
01 Apr 2025
Do We Truly Need So Many Samples? Multi-LLM Repeated Sampling Efficiently Scales Test-Time Compute
Jianhao Chen
Zishuo Xun
Bocheng Zhou
Han Qi
Qiaosheng Zhang
...
Wei Hu
Yuzhong Qu
W. Ouyang
Wanli Ouyang
Shuyue Hu
210
2
0
01 Apr 2025
Can LLMs Grasp Implicit Cultural Values? Benchmarking LLMs' Metacognitive Cultural Intelligence with CQ-Bench
Ziyi Liu
Priyanka Dey
Zhenyu Zhao
Jen-tse Huang
Rahul Gupta
Yang Liu
Jieyu Zhao
102
2
0
01 Apr 2025
Boosting MLLM Reasoning with Text-Debiased Hint-GRPO
Qihan Huang
Long Chan
Jinlong Liu
Wanggui He
Hao Jiang
Mingli Song
Jingyuan Chen
Chang Yao
Jie Song
LRM
87
4
0
31 Mar 2025
TuRTLe: A Unified Evaluation of LLMs for RTL Generation
Dario Garcia-Gasulla
Gokcen Kestor
Emanuele Parisi
Miquel Albertí-Binimelis
Cristian Gutierrez
Razine Moundir Ghorab
Orlando Montenegro
Bernat Homs
Miquel Moreto
123
1
0
31 Mar 2025
Can Test-Time Scaling Improve World Foundation Model?
Wenyan Cong
Hanqing Zhu
Peihao Wang
Bangya Liu
Dejia Xu
Kevin Wang
David Z. Pan
Yan Wang
Zhiwen Fan
Ziyi Wang
148
1
0
31 Mar 2025
CrowdVLM-R1: Expanding R1 Ability to Vision Language Model for Crowd Counting using Fuzzy Group Relative Policy Reward
Zhiqiang Wang
Pengbin Feng
Yanbin Lin
Shuzhang Cai
Zongao Bian
Jinghua Yan
Xingquan Zhu
94
4
0
31 Mar 2025
Model Hemorrhage and the Robustness Limits of Large Language Models
Ziyang Ma
Hui Yuan
Lefei Zhang
Gui-Song Xia
Bo Du
Liangpei Zhang
Dacheng Tao
129
1
0
31 Mar 2025
Adaptive Layer-skipping in Pre-trained LLMs
Xuan Luo
Weizhi Wang
Xifeng Yan
474
1
0
31 Mar 2025
SQuat: Subspace-orthogonal KV Cache Quantization
Hao Wang
Ligong Han
Kai Xu
Akash Srivastava
MQ
121
1
0
31 Mar 2025
HumanAesExpert: Advancing a Multi-Modality Foundation Model for Human Image Aesthetic Assessment
Zhichao Liao
Xiaokun Liu
Wenyu Qin
Qingyu Li
Qiulin Wang
Pengfei Wan
Di Zhang
Long Zeng
Pingfa Feng
207
1
0
31 Mar 2025
Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1
Yi Chen
Yuying Ge
Rui Wang
Yixiao Ge
Lu Qiu
Ying Shan
Xihui Liu
ReLM, VLM, OffRL, LRM
122
10
0
31 Mar 2025
Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead
Vidhisha Balachandran
Jingya Chen
Lingjiao Chen
Shivam Garg
Neel Joshi
...
John Langford
Besmira Nushi
Vibhav Vineet
Yue Wu
Safoora Yousefi
ReLM, LRM
195
8
0
31 Mar 2025
JudgeLRM: Large Reasoning Models as a Judge
Nuo Chen
Zhiyuan Hu
Qingyun Zou
Jiaying Wu
Qian Wang
Bryan Hooi
Bingsheng He
ReLM, ELM, LRM
194
15
0
31 Mar 2025
Large Language Models in Numberland: A Quick Test of Their Numerical Reasoning Abilities
Roussel Rahman
ReLM, ELM, LRM
100
1
0
31 Mar 2025
LLM4FS: Leveraging Large Language Models for Feature Selection and How to Improve It
Jianhao Li
Xianchao Xiu
111
0
0
31 Mar 2025
Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning
J. Lin
Tian Wang
Kun Qian
LRM
152
7
0
31 Mar 2025
RARE: Retrieval-Augmented Reasoning Modeling
Zhengren Wang
Jiayang Yu
Dongsheng Ma
Zhe Chen
Yu Wang
...
Feiyu Xiong
Yanfeng Wang
Weinan E
Linpeng Tang
Wentao Zhang
RALM, LRM
130
3
0
30 Mar 2025
CrossWordBench: Evaluating the Reasoning Capabilities of LLMs and LVLMs with Controllable Puzzle Generation
Jixuan Leng
Chengsong Huang
Langlin Huang
Bill Yuchen Lin
William W. Cohen
Haohan Wang
Jiaxin Huang
LRM
174
1
0
30 Mar 2025
Evolutionary Prompt Optimization Discovers Emergent Multimodal Reasoning Strategies in Vision-Language Models
Sid Bharthulwar
John Rho
Katrina Brown
ReLM, VLM, LRM
100
0
0
30 Mar 2025
A Survey on Unlearnable Data
Jiahao Li
Yiqiang Chen
Yunbing Xing
Yang Gu
Xiangyuan Lan
AAML
120
0
0
30 Mar 2025
From Panels to Prose: Generating Literary Narratives from Comics
Ragav Sachdeva
Andrew Zisserman
112
1
0
30 Mar 2025
Aurelia: Test-time Reasoning Distillation in Audio-Visual LLMs
Sanjoy Chowdhury
Hanan Gani
Nishit Anand
Sayan Nag
Ruohan Gao
Mohamed Elhoseiny
Salman Khan
Dinesh Manocha
LRM
191
1
0
29 Mar 2025
DAT: Dynamic Alpha Tuning for Hybrid Retrieval in Retrieval-Augmented Generation
Hsin-Ling Hsu
Jengnan Tzeng
62
1
0
29 Mar 2025
A Retrieval-Augmented Knowledge Mining Method with Deep Thinking LLMs for Biomedical Research and Clinical Support
Yichun Feng
Jiawei Wang
Ruikun He
Lu Zhou
Yixue Li
RALM
112
1
0
29 Mar 2025
When 'YES' Meets 'BUT': Can Large Models Comprehend Contradictory Humor Through Comparative Reasoning?
Tuo Liang
Zhe Hu
Jing Li
Hao Zhang
Yiren Lu
...
Yiran Qiao
Disheng Liu
Jeirui Peng
Jing Ma
Yu Yin
142
0
0
29 Mar 2025
Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL
Mohammadreza Pourreza
Shayan Talaei
Ruoxi Sun
Xingchen Wan
Hailong Li
Azalia Mirhoseini
Amin Saberi
Sercan O. Arik
ReLM, AI4TS, LRM
157
12
0
29 Mar 2025
Task-Aware Parameter-Efficient Fine-Tuning of Large Pre-Trained Models at the Edge
Senkang Hu
Yanan Ma
Yihang Tao
Zhengru Fang
Zihan Fang
Yiqin Deng
Sam Kwong
Yuguang Fang
80
0
0
29 Mar 2025
Efficient Inference for Large Reasoning Models: A Survey
Yi Liu
Jiaying Wu
Yufei He
Hongcheng Gao
Hongyu Chen
Baolong Bi
Jiaheng Zhang
Zhiqi Huang
Bryan Hooi
LLMAG, LRM
179
17
0
29 Mar 2025