ResearchTrend.AI
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning


22 January 2025
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
Ruoyu Zhang
Ran Xu
Qihao Zhu
Shirong Ma
P. Wang
Xiao Bi
Xiaokang Zhang
X. Yu
Yu-Huan Wu
Z. F. Wu
Zhibin Gou
Z. Shao
Zhuoshu Li
Z. Gao
Aixin Liu
Bing Xue
Bingxuan Wang
Bochao Wu
B. Feng
Chengda Lu
Chenggang Zhao
Chengqi Deng
Chenyi Zhang
Chong Ruan
Damai Dai
Deli Chen
Dongjie Ji
Erhang Li
F. Lin
Fucong Dai
Fuli Luo
Guangbo Hao
Guanting Chen
Guozhang Li
Han Zhang
Han Bao
Hanwei Xu
Hairu Wang
Honghui Ding
Huajian Xin
Huazuo Gao
Hui Qu
Hui Li
Jianzhong Guo
Jiashi Li
Jiawei Wang
Jianfei Chen
Jingyang Yuan
Junjie Qiu
Junlong Li
Jianfeng Cai
Jiaqi Ni
Jian Liang
Jin Chen
Kai Dong
Kai Hu
Kaige Gao
Kang Guan
Kexin Huang
Kuai Yu
Lean Wang
Lecong Zhang
Liang Zhao
L. Wang
Liyue Zhang
Lei Xu
Leyi Xia
Mingchuan Zhang
Minghua Zhang
Minghui Tang
Meng Li
Miaojun Wang
Mingming Li
Ning Tian
Panpan Huang
Peng Zhang
Qian Wang
Qinyu Chen
Qiushi Du
Ruiqi Ge
Ruisong Zhang
Ruizhe Pan
R. Wang
Renqi Chen
Rong Jin
Ruyi Chen
Shanghao Lu
Shangyan Zhou
Tian Jin
Shengfeng Ye
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
    ReLM
    VLM
    OffRL
    AI4TS
    LRM

Papers citing "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning"

50 / 812 papers shown
From Principles to Applications: A Comprehensive Survey of Discrete Tokenizers in Generation, Comprehension, Recommendation, and Information Retrieval
Jian Jia
Jingtong Gao
Ben Xue
Junhao Wang
Qingpeng Cai
Quan Chen
Xiangyu Zhao
Peng Jiang
Kun Gai
OffRL
77
0
0
18 Feb 2025
LAMD: Context-driven Android Malware Detection and Classification with LLMs
Xingzhi Qian
Xinran Zheng
Yiling He
Shuo Yang
Lorenzo Cavallaro
83
2
0
18 Feb 2025
RePrompt: Planning by Automatic Prompt Engineering for Large Language Models Agents
Weizhe Chen
Sven Koenig
B. Dilkina
LLMAG
112
8
0
17 Feb 2025
Locally-Deployed Chain-of-Thought (CoT) Reasoning Model in Chemical Engineering: Starting from 30 Experimental Data
Tianhang Zhou
Yingchun Niu
Xingying Lan
Chunming Xu
LRM
49
0
0
17 Feb 2025
MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs
Andreas Opedal
Haruki Shirakami
Bernhard Schölkopf
Abulhair Saparov
Mrinmaya Sachan
LRM
57
3
0
17 Feb 2025
Ad-hoc Concept Forming in the Game Codenames as a Means for Evaluating Large Language Models
Sherzod Hakimov
Lara Pfennigschmidt
David Schlangen
ELM
57
0
0
17 Feb 2025
A Survey of Personalized Large Language Models: Progress and Future Directions
Jiahong Liu
Zexuan Qiu
Zhongyang Li
Quanyu Dai
Jieming Zhu
Minda Hu
Menglin Yang
Irwin King
LM&MA
58
4
0
17 Feb 2025
Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models
Hyunwoo Kim
Melanie Sclar
Tan Zhi-Xuan
Lance Ying
Sydney Levine
Yang Liu
Joshua B. Tenenbaum
Yejin Choi
LRM
LLMAG
61
0
0
17 Feb 2025
Quantifying the Capability Boundary of DeepSeek Models: An Application-Driven Performance Analysis
Kaikai Zhao
Zhaoxiang Liu
Xuejiao Lei
Rongjia Du
Zhenhong Long
...
Minjie Hua
Kai Wang
Wen Liu
Ning Wang
Kai Wang
ELM
LRM
60
1
0
16 Feb 2025
PEA: Enhancing LLM Performance on Computational-Reasoning Tasks
Zi Wang
Shiwei Weng
Mohannad J. Alhanahnah
S. Jha
Tom Reps
LRM
ReLM
46
0
0
16 Feb 2025
Safety Evaluation of DeepSeek Models in Chinese Contexts
Wenjing Zhang
Xuejiao Lei
Zhaoxiang Liu
Rongjia Du
Zhenhong Long
...
Jiaojiao Zhao
Minjie Hua
Chaoyang Ma
Kai Wang
Kai Wang
ELM
129
8
0
16 Feb 2025
Uncertainty-Aware Step-wise Verification with Generative Reward Models
Zihuiwen Ye
Luckeciano C. Melo
Younesse Kaddar
Phil Blunsom
Shivalika Singh
Yarin Gal
LRM
49
2
0
16 Feb 2025
Exposing Numeracy Gaps: A Benchmark to Evaluate Fundamental Numerical Abilities in Large Language Models
Haoyang Li
Xuejia Chen
Zhanchao Xu
Darian Li
Nicole Hu
...
Heng Chang
Luyu Qiu
C. Zhang
Qing Li
Lei Chen
LRM
ELM
48
1
0
16 Feb 2025
To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models
Zihao Zhu
Hongbao Zhang
Ruotong Wang
Ke Xu
Siwei Lyu
Baoyuan Wu
AAML
LRM
67
5
0
16 Feb 2025
Typhoon T1: An Open Thai Reasoning Model
Pittawat Taveekitworachai
Potsawee Manakul
Kasima Tharnpipitchai
Kunat Pipatanakul
OffRL
LRM
102
0
0
13 Feb 2025
DeepSeek on a Trip: Inducing Targeted Visual Hallucinations via Representation Vulnerabilities
Chashi Mahiul Islam
Samuel Jacob Chacko
Preston Horne
Xiuwen Liu
110
1
0
11 Feb 2025
Trustworthy AI on Safety, Bias, and Privacy: A Survey
Xingli Fang
Jianwei Li
Varun Mulchandani
Jung-Eun Kim
45
0
0
11 Feb 2025
Evaluating the Systematic Reasoning Abilities of Large Language Models through Graph Coloring
Alex Heyman
Joel Zylberberg
LRM
50
1
0
10 Feb 2025
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates
L. Yang
Zhaochen Yu
Bin Cui
Mengdi Wang
ReLM
LRM
AI4CE
101
12
0
10 Feb 2025
VersaPRM: Multi-Domain Process Reward Model via Synthetic Reasoning Data
Thomas Zeng
Shuibai Zhang
Shutong Wu
Christian Classen
Daewon Chae
...
Jungtaek Kim
H. Koo
Kannan Ramchandran
Dimitris Papailiopoulos
Kangwook Lee
LRM
77
3
0
10 Feb 2025
Digital Twin Buildings: 3D Modeling, GIS Integration, and Visual Descriptions Using Gaussian Splatting, ChatGPT/Deepseek, and Google Maps Platform
K. Gao
Dening Lu
Liangzhi Li
Nan Chen
Hongjie He
Linlin Xu
Jonathan Li
3DGS
3DPC
AI4CE
63
1
0
09 Feb 2025
Enhancing Depression Detection with Chain-of-Thought Prompting: From Emotion to Reasoning Using Large Language Models
Shiyu Teng
Jiaqing Liu
R. Jain
Shurong Chai
Ruibo Hou
T. Tateyama
Lanfen Lin
Yuxiao Chen
AI4MH
LRM
63
1
0
09 Feb 2025
Dynamic Chain-of-Thought: Towards Adaptive Deep Reasoning
Libo Wang
LRM
216
1
0
07 Feb 2025
LLMs Can Teach Themselves to Better Predict the Future
Benjamin Turtel
Danny Franklin
Philipp Schoenegger
LRM
62
0
0
07 Feb 2025
Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
Alireza Amiri
Xinting Huang
Mark Rofin
Michael Hahn
LRM
254
0
0
04 Feb 2025
Brief analysis of DeepSeek R1 and its implications for Generative AI
Sarah Mercer
Samuel Spillard
Daniel P. Martin
79
13
0
04 Feb 2025
OverThink: Slowdown Attacks on Reasoning LLMs
A. Kumar
Jaechul Roh
A. Naseh
Marzena Karpinska
Mohit Iyyer
Amir Houmansadr
Eugene Bagdasarian
LRM
66
17
0
04 Feb 2025
Can LLMs Maintain Fundamental Abilities under KV Cache Compression?
Xiang Liu
Zhenheng Tang
Hong Chen
Peijie Dong
Zeyu Li
Xiuze Zhou
Bo Li
Xuming Hu
Xiaowen Chu
251
4
0
04 Feb 2025
What is a Number, That a Large Language Model May Know It?
Raja Marjieh
Veniamin Veselovsky
Thomas Griffiths
Ilia Sucholutsky
239
2
0
03 Feb 2025
ReFoRCE: A Text-to-SQL Agent with Self-Refinement, Consensus Enforcement, and Column Exploration
Minghang Deng
Ashwin Ramachandran
Canwen Xu
Lanxiang Hu
Zhewei Yao
Anupam Datta
Hao Zhang
LMTD
139
1
0
02 Feb 2025
Calling a Spade a Heart: Gaslighting Multimodal Large Language Models via Negation
Bin Zhu
Hui yan Qi
Yinxuan Gui
Jingjing Chen
Chong-Wah Ngo
Ee-Peng Lim
190
1
0
31 Jan 2025
GuardReasoner: Towards Reasoning-based LLM Safeguards
Yue Liu
Hongcheng Gao
Shengfang Zhai
Jun Xia
Tianyi Wu
Zhiwei Xue
Yuxiao Chen
Kenji Kawaguchi
Jiaheng Zhang
Bryan Hooi
AI4TS
LRM
136
16
0
30 Jan 2025
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Tianzhe Chu
Yuexiang Zhai
Jihan Yang
Shengbang Tong
Saining Xie
Dale Schuurmans
Quoc V. Le
Sergey Levine
Yi Ma
OffRL
70
67
0
28 Jan 2025
ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer
Lin Yueyu
Li Zhiyuan
Peter Yue
Liu Xiao
42
5
0
28 Jan 2025
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
Samira Abnar
Harshay Shah
Dan Busbridge
Alaaeldin Mohamed Elnouby Ali
J. Susskind
Vimal Thilak
MoE
LRM
49
5
0
28 Jan 2025
CodeMonkeys: Scaling Test-Time Compute for Software Engineering
Ryan Ehrlich
Bradley Brown
Jordan Juravsky
Ronald Clark
Christopher Ré
Azalia Mirhoseini
57
8
0
24 Jan 2025
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
Xingyu Chen
Jiahao Xu
Tian Liang
Zhiwei He
Jianhui Pang
...
Zizhuo Zhang
Rui Wang
Zhaopeng Tu
Haitao Mi
Dong Yu
LRM
ReLM
61
113
0
30 Dec 2024
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
134
9
0
19 Dec 2024
SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion
Ximing Xing
Juncheng Hu
Jing Zhang
Dong Xu
Qian Yu
91
1
0
11 Dec 2024
Unifying KV Cache Compression for Large Language Models with LeanKV
Yanqi Zhang
Yuwei Hu
Runyuan Zhao
John C. S. Lui
Haibo Chen
MQ
151
5
0
04 Dec 2024
Enhancing Answer Reliability Through Inter-Model Consensus of Large Language Models
Alireza Amiri-Margavi
Iman Jebellat
Ehsan Jebellat
Seyed Pouyan Mousavi Davoudi
99
2
0
25 Nov 2024
DART-LLM: Dependency-Aware Multi-Robot Task Decomposition and Execution using Large Language Models
Yongdong Wang
Runze Xiao
Jun Younes Louhi Kasahara
Ryosuke Yajima
Keiji Nagatani
Atsushi Yamashita
Hajime Asama
39
3
0
13 Nov 2024
Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies
Liwen Wang
Sheng Chen
Linnan Jiang
Shu Pan
Runze Cai
Sen Yang
Fei Yang
52
3
0
24 Oct 2024
Understanding Layer Significance in LLM Alignment
Guangyuan Shi
Zexin Lu
Xiaoyu Dong
Wenlong Zhang
Xuanyu Zhang
Yujie Feng
Xiao-Ming Wu
58
2
0
23 Oct 2024
Electrocardiogram-Language Model for Few-Shot Question Answering with Meta Learning
Jialu Tang
Tong Xia
Yuan Lu
Cecilia Mascolo
Aaqib Saeed
AI4MH
59
2
0
18 Oct 2024
Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation
Yiming Wang
Pei Zhang
Baosong Yang
Derek F. Wong
Rui-cang Wang
LRM
56
5
0
17 Oct 2024
Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data
Seiji Maekawa
Hayate Iso
Nikita Bhutani
RALM
110
1
0
15 Oct 2024
Autonomous Evaluation of LLMs for Truth Maintenance and Reasoning Tasks
Rushang Karia
Daniel Bramblett
D. Dobhal
Siddharth Srivastava
ELM
LRM
35
0
0
11 Oct 2024
The Geometry of Concepts: Sparse Autoencoder Feature Structure
Yuxiao Li
Eric J. Michaud
David D. Baek
Joshua Engels
Xiaoqing Sun
Max Tegmark
58
9
0
10 Oct 2024
Mixture Compressor for Mixture-of-Experts LLMs Gains More
Wei Huang
Yue Liao
Jianhui Liu
Ruifei He
Haoru Tan
Shiming Zhang
Hongsheng Li
Si Liu
Xiaojuan Qi
MoE
39
3
0
08 Oct 2024