ResearchTrend.AI
  • Papers
  • Communities
  • Organizations
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2501.12948
  4. Cited By
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

22 January 2025
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
Ruoyu Zhang
Ran Xu
Qihao Zhu
Shirong Ma
P. Wang
Xiao Bi
Yanling Wang
X. Yu
Yu-Huan Wu
Z. F. Wu
Zhibin Gou
Z. Shao
Zhuoshu Li
Zijian Gao
Aixin Liu
Bing Xue
Bingxuan Wang
Bochao Wu
B. Feng
Chengda Lu
Chenggang Zhao
Chengqi Deng
Chenyi Zhang
Chong Ruan
Damai Dai
Deli Chen
Dongjie Ji
Erhang Li
F. Lin
Fucong Dai
Fuli Luo
Guangbo Hao
Guanting Chen
Guozhang Li
Han Zhang
Han Bao
Hanwei Xu
Han Wang
Honghui Ding
Huajian Xin
Huazuo Gao
Hui Qu
Hui Li
Jianzhong Guo
Jiashi Li
Jiawei Wang
Jianfei Chen
Jingyang Yuan
Junjie Qiu
Junlong Li
Jianfeng Cai
Jiaqi Ni
Jian Liang
Jin Chen
Kai Dong
Kai Hu
Kaige Gao
Kang Guan
Kexin Huang
Kuai Yu
Lean Wang
Lecong Zhang
Liang Zhao
L. Wang
Liyue Zhang
Lei Xu
Leyi Xia
Mingchuan Zhang
Minghua Zhang
Minghui Tang
Meng Li
Miaojun Wang
Mingming Li
Ning Tian
Panpan Huang
Peng Zhang
Qian Wang
Qinyu Chen
Qiushi Du
Ruiqi Ge
Ruisong Zhang
Ruizhe Pan
Rongpin Wang
Ruoxin Chen
Rong Jin
Ruyi Chen
Shanghao Lu
Shangyan Zhou
Tian Jin
Shengfeng Ye
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
    ReLMVLMOffRLAI4TSLRM
ArXiv (abs)PDFHTML

Papers citing "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning"

27 / 1,327 papers shown
Title
DRPruning: Efficient Large Language Model Pruning through Distributionally Robust Optimization
DRPruning: Efficient Large Language Model Pruning through Distributionally Robust Optimization
Hexuan Deng
Wenxiang Jiao
Xuebo Liu
Min Zhang
Zhaopeng Tu
Zhaopeng Tu
VLM
270
0
0
21 Nov 2024
DART-LLM: Dependency-Aware Multi-Robot Task Decomposition and Execution using Large Language Models
DART-LLM: Dependency-Aware Multi-Robot Task Decomposition and Execution using Large Language Models
Yongdong Wang
Runze Xiao
Jun Younes Louhi Kasahara
Ryosuke Yajima
Keiji Nagatani
Atsushi Yamashita
Hajime Asama
113
7
0
13 Nov 2024
Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies
Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies
Liwen Wang
Sheng Chen
Linnan Jiang
Shu Pan
Runze Cai
Sen Yang
Fei Yang
197
7
0
24 Oct 2024
Understanding Layer Significance in LLM Alignment
Understanding Layer Significance in LLM Alignment
Guangyuan Shi
Zexin Lu
Xiaoyu Dong
Wenlong Zhang
Xuanyu Zhang
Yujie Feng
Xiao-Ming Wu
151
3
0
23 Oct 2024
Electrocardiogram-Language Model for Few-Shot Question Answering with Meta Learning
Electrocardiogram-Language Model for Few-Shot Question Answering with Meta Learning
Jialu Tang
Tong Xia
Yuan Lu
Cecilia Mascolo
Aaqib Saeed
AI4MH
102
3
0
18 Oct 2024
Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation
Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation
Yiming Wang
Pei Zhang
Baosong Yang
Derek F. Wong
Rui Wang
LRM
119
15
0
17 Oct 2024
Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data
Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data
Seiji Maekawa
Hayate Iso
Nikita Bhutani
RALM
218
2
0
15 Oct 2024
TMGBench: A Systematic Game Benchmark for Evaluating Strategic Reasoning Abilities of LLMs
TMGBench: A Systematic Game Benchmark for Evaluating Strategic Reasoning Abilities of LLMs
Haoran Wang
Xiachong Feng
Lei Li
Zhan Qin
Dianbo Sui
Dianbo Sui
Lingpeng Kong
LRMELM
117
7
0
14 Oct 2024
Autonomous Evaluation of LLMs for Truth Maintenance and Reasoning Tasks
Autonomous Evaluation of LLMs for Truth Maintenance and Reasoning Tasks
Rushang Karia
Daniel Bramblett
D. Dobhal
Siddharth Srivastava
ELMLRM
117
0
0
11 Oct 2024
The Geometry of Concepts: Sparse Autoencoder Feature Structure
The Geometry of Concepts: Sparse Autoencoder Feature Structure
Yuxiao Li
Eric J. Michaud
David D. Baek
Joshua Engels
Xiaoqing Sun
Max Tegmark
122
21
0
10 Oct 2024
Mixture Compressor for Mixture-of-Experts LLMs Gains More
Mixture Compressor for Mixture-of-Experts LLMs Gains More
Wei Huang
Yue Liao
Jianhui Liu
Ruifei He
Haoru Tan
Shiming Zhang
Hongsheng Li
Si Liu
Xiaojuan Qi
MoE
131
4
0
08 Oct 2024
Differential Transformer
Differential Transformer
Tianzhu Ye
Li Dong
Yuqing Xia
Yutao Sun
Yi Zhu
Gao Huang
Furu Wei
526
0
0
07 Oct 2024
Closed-Loop Long-Horizon Robotic Planning via Equilibrium Sequence Modeling
Closed-Loop Long-Horizon Robotic Planning via Equilibrium Sequence Modeling
Jinghan Li
Zhicheng Sun
Fei Li
267
2
0
02 Oct 2024
Compositional Hardness of Code in Large Language Models -- A Probabilistic Perspective
Compositional Hardness of Code in Large Language Models -- A Probabilistic Perspective
Yotam Wolf
Binyamin Rothberg
Dorin Shteyman
Amnon Shashua
106
0
0
26 Sep 2024
Deploying Open-Source Large Language Models: A performance Analysis
Deploying Open-Source Large Language Models: A performance Analysis
Yannis Bendi-Ouis
Dan Dutarte
Xavier Hinaut
ALM
57
2
0
23 Sep 2024
InteLiPlan: An Interactive Lightweight LLM-Based Planner for Domestic Robot Autonomy
InteLiPlan: An Interactive Lightweight LLM-Based Planner for Domestic Robot Autonomy
Kim Tien Ly
Kai Lu
Ioannis Havoutis
95
2
0
22 Sep 2024
AnyBipe: An End-to-End Framework for Training and Deploying Bipedal Robots Guided by Large Language Models
AnyBipe: An End-to-End Framework for Training and Deploying Bipedal Robots Guided by Large Language Models
Yifei Yao
Wentao He
Chenyu Gu
Jiaheng Du
Fuwei Tan
Zhen Zhu
Junguo Lu
OffRL
131
2
0
13 Sep 2024
A Comprehensive Review of Few-shot Action Recognition
A Comprehensive Review of Few-shot Action Recognition
Yuyang Wanyan
Xiaoshan Yang
Weiming Dong
Changsheng Xu
VLM
181
4
0
20 Jul 2024
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators
Yann Dubois
Balázs Galambosi
Percy Liang
Tatsunori Hashimoto
ALM
179
403
0
06 Apr 2024
A Survey on Large Language Model-Based Game Agents
A Survey on Large Language Model-Based Game Agents
Sihao Hu
Tiansheng Huang
Gaowen Liu
Ramana Rao Kompella
Gaowen Liu
Selim Furkan Tekin
Yichang Xu
Zachary Yahn
Ling Liu
LLMAGLM&RoAI4CELM&MA
231
58
0
02 Apr 2024
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning
Xupeng Miao
Gabriele Oliaro
Xinhao Cheng
Vineeth Kada
Ruohan Gao
...
April Yang
Yingcheng Wang
Mengdi Wu
Colin Unger
Zhihao Jia
MoE
192
11
0
29 Feb 2024
FormulaReasoning: A Dataset for Formula-Based Numerical Reasoning
FormulaReasoning: A Dataset for Formula-Based Numerical Reasoning
Xiao Li
Bolin Zhu
Kaiwen Shi
Sichen Liu
Yin Zhu
Yiwei Liu
Gong Cheng
AIMat
108
1
0
20 Feb 2024
AQA-Bench: An Interactive Benchmark for Evaluating LLMs' Sequential Reasoning Ability
AQA-Bench: An Interactive Benchmark for Evaluating LLMs' Sequential Reasoning Ability
Siwei Yang
Bingchen Zhao
Cihang Xie
LRM
66
6
0
14 Feb 2024
LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs' Vulnerability Reasoning
LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs' Vulnerability Reasoning
Yuqiang Sun
Daoyuan Wu
Yue Xue
Han Liu
Wei Ma
Lyuye Zhang
Miaolei Shi
Yingjiu Li
ELM
211
55
0
29 Jan 2024
GLoRE: Evaluating Logical Reasoning of Large Language Models
GLoRE: Evaluating Logical Reasoning of Large Language Models
Hanmeng Liu
Zhiyang Teng
Ruoxi Ning
Jian Liu
Qiji Zhou
Yuexin Zhang
Yue Zhang
ReLMELMLRM
175
8
0
13 Oct 2023
Evaluating the Capability of Large-scale Language Models on Chinese Grammatical Error Correction Task
Evaluating the Capability of Large-scale Language Models on Chinese Grammatical Error Correction Task
Fanyi Qu
Hao Sun
Yunfang Wu
ELMLRM
133
7
0
08 Jul 2023
Tensor Networks Meet Neural Networks: A Survey and Future Perspectives
Tensor Networks Meet Neural Networks: A Survey and Future Perspectives
Maolin Wang
Yu Pan
Zenglin Xu
Xiangli Yang
Guangxi Li
A. Cichocki
Andrzej Cichocki
212
22
0
22 Jan 2023
Previous
123...252627