ResearchTrend.AI
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

22 January 2025
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
Ruoyu Zhang
Ran Xu
Qihao Zhu
Shirong Ma
P. Wang
Xiao Bi
Yanling Wang
X. Yu
Yu-Huan Wu
Z. F. Wu
Zhibin Gou
Z. Shao
Zhuoshu Li
Zijian Gao
Aixin Liu
Bing Xue
Bingxuan Wang
Bochao Wu
B. Feng
Chengda Lu
Chenggang Zhao
Chengqi Deng
Chenyi Zhang
Chong Ruan
Damai Dai
Deli Chen
Dongjie Ji
Erhang Li
F. Lin
Fucong Dai
Fuli Luo
Guangbo Hao
Guanting Chen
Guozhang Li
Han Zhang
Han Bao
Hanwei Xu
Han Wang
Honghui Ding
Huajian Xin
Huazuo Gao
Hui Qu
Hui Li
Jianzhong Guo
Jiashi Li
Jiawei Wang
Jianfei Chen
Jingyang Yuan
Junjie Qiu
Junlong Li
Jianfeng Cai
Jiaqi Ni
Jian Liang
Jin Chen
Kai Dong
Kai Hu
Kaige Gao
Kang Guan
Kexin Huang
Kuai Yu
Lean Wang
Lecong Zhang
Liang Zhao
L. Wang
Liyue Zhang
Lei Xu
Leyi Xia
Mingchuan Zhang
Minghua Zhang
Minghui Tang
Meng Li
Miaojun Wang
Mingming Li
Ning Tian
Panpan Huang
Peng Zhang
Qian Wang
Qinyu Chen
Qiushi Du
Ruiqi Ge
Ruisong Zhang
Ruizhe Pan
Rongpin Wang
Ruoxin Chen
Rong Jin
Ruyi Chen
Shanghao Lu
Shangyan Zhou
Tian Jin
Shengfeng Ye
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM, VLM, OffRL, AI4TS, LRM

Papers citing "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning"

50 / 1,327 papers shown
Shape and Texture Recognition in Large Vision-Language Models
Sagi Eppel
Mor Bismut
Alona Faktor
3DV, VLM
114
2
0
29 Mar 2025
Factored Agents: Decoupling In-Context Learning and Memorization for Robust Tool Use
Nicholas Roth
Christopher Hidey
Lucas Spangher
William Arnold
Chang Ye
Nick Masiewicki
Jinoo Baek
Peter Grabowski
Eugene Ie
LLMAG
141
0
0
29 Mar 2025
Entropy-guided sequence weighting for efficient exploration in RL-based LLM fine-tuning
Abdullah Vanlioglu
152
0
0
28 Mar 2025
PharmAgents: Building a Virtual Pharma with Large Language Model Agents
B. Gao
Yanwen Huang
Yiqiao Liu
Wenxuan Xie
Wei-Ying Ma
Ya Zhang
Yanyan Lan
LLMAG, LM&Ro
164
3
0
28 Mar 2025
FRASE: Structured Representations for Generalizable SPARQL Query Generation
Papa Abdou Karim Karou Diallo
Payel Das
79
0
0
28 Mar 2025
REMAC: Self-Reflective and Self-Evolving Multi-Agent Collaboration for Long-Horizon Robot Manipulation
Puzhen Yuan
Angyuan Ma
Yunchao Yao
Huaxiu Yao
Masayoshi Tomizuka
Mingyu Ding
LM&Ro
136
3
0
28 Mar 2025
L0-Reasoning Bench: Evaluating Procedural Correctness in Language Models via Simple Program Execution
Simeng Sun
Cheng-Ping Hsieh
Faisal Ladhak
Erik Arakelyan
Santiago Akle Serano
Boris Ginsburg
ReLM, ELM, LRM
509
0
0
28 Mar 2025
Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models
Zhanke Zhou
Zhaocheng Zhu
Xuan Li
Mikhail Galkin
Xiao Feng
Sanmi Koyejo
Jian Tang
Bo Han
LRM
181
6
0
28 Mar 2025
Probabilistic Uncertain Reward Model
Wangtao Sun
Xiang Cheng
Xing Yu
Haotian Xu
Zhao Yang
Shizhu He
Jun Zhao
Kang Liu
193
0
0
28 Mar 2025
RLDBF: Enhancing LLMs Via Reinforcement Learning With DataBase FeedBack
Weichen Dai
Zijie Dai
Zhijie Huang
Yixuan Pan
Xinhe Li
Xi Li
Yi Zhou
Ji Qi
Wu Jiang
63
0
0
28 Mar 2025
Exploring Data Scaling Trends and Effects in Reinforcement Learning from Human Feedback
Wei Shen
Guanlin Liu
Zheng Wu
Ruofei Zhu
Qingping Yang
Chao Xin
Yu Yue
Lin Yan
169
14
0
28 Mar 2025
Q-Insight: Understanding Image Quality via Visual Reinforcement Learning
Weiqi Li
Xinyu Zhang
Shijie Zhao
Yize Zhang
Junlin Li
Li Zhang
Jian Zhang
123
11
0
28 Mar 2025
Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks
Weinan Zhang
Mengna Wang
Gangao Liu
Xu Huixin
Yiwei Jiang
...
Hang Zhang
Xin Li
Weiming Lu
Peng Li
Yueting Zhuang
LM&Ro, LRM
209
9
0
27 Mar 2025
Unlocking the Potential of Past Research: Using Generative AI to Reconstruct Healthcare Simulation Models
Thomas Monks
Alison Harper
Amy Heather
94
0
0
27 Mar 2025
Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad
Ivo Petrov
Jasper Dekoninck
Lyuben Baltadzhiev
Maria Drencheva
Kristian Minchev
Mislav Balunović
Nikola Jovanović
Martin Vechev
LRM, ELM
149
23
0
27 Mar 2025
LeX-Art: Rethinking Text Generation via Scalable High-Quality Data Synthesis
Jike Zhong
Qilong Wu
Xinyue Li
Bo Zhang
Ming Li
...
Haoyang Li
Yu Qiao
Peng Gao
Bin Fu
Zhen Li
EGVM
89
1
0
27 Mar 2025
ThinkEdit: Interpretable Weight Editing to Mitigate Overly Short Thinking in Reasoning Models
Chung-En Sun
Ge Yan
Tsui-Wei Weng
KELM, LRM
118
3
0
27 Mar 2025
debug-gym: A Text-Based Environment for Interactive Debugging
Xingdi Yuan
Morgane M Moss
Charbel El Feghali
Chinmay Singh
Darya Moldavskaya
...
Lucas Caccia
Matheus Pereira
Minseon Kim
Alessandro Sordoni
Marc-Alexandre Côté
LLMAG
140
2
0
27 Mar 2025
Medical Reasoning in LLMs: An In-Depth Analysis of DeepSeek R1
Birger Moëll
Fredrik Sand Aronsson
Sanian Akbar
ELM, LRM
72
1
0
27 Mar 2025
Video-R1: Reinforcing Video Reasoning in MLLMs
Kaituo Feng
Kaixiong Gong
Yangqiu Song
Zonghao Guo
Yibing Wang
Tianshuo Peng
Jian Wu
Xiaoying Zhang
Benyou Wang
Xiangyu Yue
AI4TS, SyDa, LRM
197
62
0
27 Mar 2025
Can Large Language Models Predict Associations Among Human Attitudes?
Ana Ma
Derek Powell
96
0
0
26 Mar 2025
From Trial to Triumph: Advancing Long Video Understanding via Visual Context Sample Scaling and Self-reward Alignment
Yucheng Suo
Fan Ma
Linchao Zhu
T. Wang
Fengyun Rao
Yi Yang
LRM
163
0
0
26 Mar 2025
Understanding R1-Zero-Like Training: A Critical Perspective
Zichen Liu
Changyu Chen
Wenjun Li
Penghui Qi
Tianyu Pang
Chao Du
Wee Sun Lee
Jialin Li
OffRL, LRM
248
172
0
26 Mar 2025
A Large-Scale Vision-Language Dataset Derived from Open Scientific Literature to Advance Biomedical Generalist AI
Alejandro Lozano
Min Woo Sun
James Burgess
Jeffrey Nirschl
Christopher Polzak
...
Xiaohan Wang
Alfred Seunghoon Song
Chiang Chia-Chun
Robert Tibshirani
Serena Yeung-Levy
LM&MA
182
2
0
26 Mar 2025
Open Deep Search: Democratizing Search with Open-source Reasoning Agents
Salaheddin Alzubi
Creston Brooks
Purva Chiniya
Edoardo Contente
Chiara von Gerlach
...
Arda Kaz
Windsor Nguyen
Sewoong Oh
Himanshu Tyagi
Pramod Viswanath
VLM, ELM, LRM
196
12
0
26 Mar 2025
A multi-agentic framework for real-time, autonomous freeform metasurface design
Robert Lupoiu
Yixuan Shao
Tianxiang Dai
Chenkai Mao
Kofi Edee
Jonathan A. Fan
AI4CE
110
1
0
26 Mar 2025
Cyborg Data: Merging Human with AI Generated Training Data
Kai North
Christopher Ormerod
72
0
0
26 Mar 2025
Reasoning Beyond Limits: Advances and Open Problems for LLMs
M. Ferrag
Norbert Tihanyi
Merouane Debbah
ELM, OffRL, LRM, AI4CE
442
4
0
26 Mar 2025
Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning
Huajie Tan
Yuheng Ji
Xiaoshuai Hao
Minglan Lin
Pengwei Wang
Zhongyuan Wang
Shanghang Zhang
ReLM, OffRL, LRM
224
0
0
26 Mar 2025
Optimal Scaling Laws for Efficiency Gains in a Theoretical Transformer-Augmented Sectional MoE Framework
Soham Sane
MoE
102
0
0
26 Mar 2025
RALLRec+: Retrieval Augmented Large Language Model Recommendation with Reasoning
Sichun Luo
Jian Xu
Xinsong Zhang
Linrong Wang
Sicong Liu
Hanxu Hou
Linqi Song
RALM, 3DV, LRM
124
0
0
26 Mar 2025
RL-finetuning LLMs from on- and off-policy data with a single algorithm
Yunhao Tang
Taco Cohen
David W. Zhang
Michal Valko
Rémi Munos
OffRL
106
4
0
25 Mar 2025
Innate Reasoning is Not Enough: In-Context Learning Enhances Reasoning Large Language Models with Less Overthinking
Yuyao Ge
Shenghua Liu
Yansen Wang
Lingrui Mei
Lizhe Chen
Baolong Bi
Xueqi Cheng
ReLM, LRM
101
6
0
25 Mar 2025
Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking
Xiaoyu Tian
Sitong Zhao
Haotian Wang
Shuaiting Chen
Yunjie Ji
Yiping Peng
Han Zhao
Xiangang Li
ReLM, ELM, LRM
117
10
0
25 Mar 2025
Analyzable Chain-of-Musical-Thought Prompting for High-Fidelity Music Generation
Max W. Y. Lam
Yijin Xing
Weiya You
Jingcheng Wu
Zongyu Yin
...
T. Zhao
Chien-Hung Liu
Xuchen Song
Yang Li
Yahui Zhou
LRM
108
4
0
25 Mar 2025
Iterative Hypothesis Generation for Scientific Discovery with Monte Carlo Nash Equilibrium Self-Refining Trees
Gollam Rabby
Diyana Muhammed
Prasenjit Mitra
Sören Auer
87
2
0
25 Mar 2025
Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators
Seungone Kim
Ian Wu
Jinu Lee
Xiang Yue
Seongyun Lee
...
Kiril Gashteovski
Carolin (Haas) Lawrence
Julia Hockenmaier
Graham Neubig
Sean Welleck
LRM
110
5
0
25 Mar 2025
Long-Context Autoregressive Video Modeling with Next-Frame Prediction
Yuchao Gu
Weijia Mao
Mike Zheng Shou
VGen
185
11
0
25 Mar 2025
Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation
Hongcheng Gao
Jiashu Qu
Jingyi Tang
Baolong Bi
Yi Liu
Hongyu Chen
Li Liang
Li Su
Qingming Huang
MLLM, VLM, LRM
179
6
0
25 Mar 2025
LEGO-Puzzles: How Good Are MLLMs at Multi-Step Spatial Reasoning?
Kexian Tang
Junyao Gao
Yanhong Zeng
Haodong Duan
Yanan Sun
Zhening Xing
Wenran Liu
Kaifeng Lyu
Kai-xiang Chen
ELM, LRM
163
9
0
25 Mar 2025
Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing
Jaihoon Kim
Taehoon Yoon
Jisung Hwang
Minhyuk Sung
DiffM
181
3
0
25 Mar 2025
Will LLMs be Professional at Fund Investment? DeepFund: A Live Arena Perspective
Changlun Li
Yao Shi
Yuyu Luo
Nan Tang
AIFin
115
0
0
24 Mar 2025
MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse
Zhenyu Pan
Han Liu
OffRL, LRM
148
7
0
24 Mar 2025
RLCAD: Reinforcement Learning Training Gym for Revolution Involved CAD Command Sequence Generation
Xiaolong Yin
Xingyu Lu
Jiahang Shen
Jingzhe Ni
Hailong Li
Ruofeng Tong
Min Tang
Peng Du
3DV
87
2
0
24 Mar 2025
Video-T1: Test-Time Scaling for Video Generation
Fan Liu
Hanyang Wang
Yimo Cai
Kaiyan Zhang
Xiaohang Zhan
Yueqi Duan
DiffM, VGen
160
7
0
24 Mar 2025
Evolutionary Policy Optimization
Jianren Wang
Yifan Su
Abhinav Gupta
Deepak Pathak
92
0
0
24 Mar 2025
Language Model Uncertainty Quantification with Attention Chain
Yinghao Li
Rushi Qiang
Lama Moukheiber
Chao Zhang
LRM
95
3
0
24 Mar 2025
Trade-offs in Large Reasoning Models: An Empirical Analysis of Deliberative and Adaptive Reasoning over Foundational Capabilities
Weixiang Zhao
Xingyu Sui
Jiahe Guo
Yulin Hu
Yang Deng
Yanyan Zhao
Bing Qin
Wanxiang Che
Tat-Seng Chua
Ting Liu
ELM, LRM
132
9
0
23 Mar 2025
AgentRxiv: Towards Collaborative Autonomous Research
Samuel Schmidgall
Michael Moor
191
8
0
23 Mar 2025
Mind with Eyes: from Language Reasoning to Multimodal Reasoning
Zhiyu Lin
Yifei Gao
Xian Zhao
Yunfan Yang
Jitao Sang
LRM
157
5
0
23 Mar 2025