2501.12948
Cited By
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
22 January 2025
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
Ruoyu Zhang
Ran Xu
Qihao Zhu
Shirong Ma
P. Wang
Xiao Bi
Yanling Wang
X. Yu
Yu-Huan Wu
Z. F. Wu
Zhibin Gou
Z. Shao
Zhuoshu Li
Zijian Gao
Aixin Liu
Bing Xue
Bingxuan Wang
Bochao Wu
B. Feng
Chengda Lu
Chenggang Zhao
Chengqi Deng
Chenyi Zhang
Chong Ruan
Damai Dai
Deli Chen
Dongjie Ji
Erhang Li
F. Lin
Fucong Dai
Fuli Luo
Guangbo Hao
Guanting Chen
Guozhang Li
Han Zhang
Han Bao
Hanwei Xu
Han Wang
Honghui Ding
Huajian Xin
Huazuo Gao
Hui Qu
Hui Li
Jianzhong Guo
Jiashi Li
Jiawei Wang
Jianfei Chen
Jingyang Yuan
Junjie Qiu
Junlong Li
Jianfeng Cai
Jiaqi Ni
Jian Liang
Jin Chen
Kai Dong
Kai Hu
Kaige Gao
Kang Guan
Kexin Huang
Kuai Yu
Lean Wang
Lecong Zhang
Liang Zhao
L. Wang
Liyue Zhang
Lei Xu
Leyi Xia
Mingchuan Zhang
Minghua Zhang
Minghui Tang
Meng Li
Miaojun Wang
Mingming Li
Ning Tian
Panpan Huang
Peng Zhang
Qian Wang
Qinyu Chen
Qiushi Du
Ruiqi Ge
Ruisong Zhang
Ruizhe Pan
Rongpin Wang
Ruoxin Chen
Rong Jin
Ruyi Chen
Shanghao Lu
Shangyan Zhou
Tian Jin
Shengfeng Ye
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
Papers citing "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning"
50 / 1,327 papers shown
Title
Language Models Can Predict Their Own Behavior
Dhananjay Ashok
Jonathan May
ReLM
AI4TS
LRM
137
2
0
18 Feb 2025
Baichuan-M1: Pushing the Medical Capability of Large Language Models
Binghai Wang
Haizhou Zhao
Huozhi Zhou
Liang Song
Mingyu Xu
...
Yan Zhang
Yifei Duan
Yuyan Zhou
Zhi-Ming Ma
Zhikai Wu
LM&MA
ELM
AI4MH
123
10
0
18 Feb 2025
NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions
Weizhe Yuan
Jane Dwivedi-Yu
Song Jiang
Karthik Padthe
Yang Li
...
Ilia Kulikov
Kyunghyun Cho
Yuandong Tian
Jason Weston
Xian Li
ReLM
LRM
180
20
0
18 Feb 2025
Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees
Yongtao Wu
Luca Viano
Yihang Chen
Zhenyu Zhu
Kimon Antonakopoulos
Quanquan Gu
Volkan Cevher
187
1
0
18 Feb 2025
LAMD: Context-driven Android Malware Detection and Classification with LLMs
Xingzhi Qian
Xinran Zheng
Yiling He
Shuo Yang
Lorenzo Cavallaro
176
4
0
18 Feb 2025
From Principles to Applications: A Comprehensive Survey of Discrete Tokenizers in Generation, Comprehension, Recommendation, and Information Retrieval
Jian Jia
Jingtong Gao
Ben Xue
Junhao Wang
Qingpeng Cai
Quan Chen
Xiangyu Zhao
Peng Jiang
Kun Gai
OffRL
151
2
0
18 Feb 2025
Locally-Deployed Chain-of-Thought (CoT) Reasoning Model in Chemical Engineering: Starting from 30 Experimental Data
Tianhang Zhou
Yingchun Niu
Xingying Lan
Chunming Xu
LRM
106
0
0
17 Feb 2025
A-MEM: Agentic Memory for LLM Agents
Wujiang Xu
Zujie Liang
Kai Mei
Hang Gao
Juntao Tan
Yongfeng Zhang
LLMAG
KELM
RALM
342
28
0
17 Feb 2025
A Survey of Personalized Large Language Models: Progress and Future Directions
Jiahong Liu
Zexuan Qiu
Zhongyang Li
Quanyu Dai
Jieming Zhu
Minda Hu
Menglin Yang
Irwin King
LM&MA
108
9
0
17 Feb 2025
MathGAP: Out-of-Distribution Evaluation on Problems with Arbitrarily Complex Proofs
Andreas Opedal
Haruki Shirakami
Bernhard Schölkopf
Abulhair Saparov
Mrinmaya Sachan
LRM
154
3
0
17 Feb 2025
Ad-hoc Concept Forming in the Game Codenames as a Means for Evaluating Large Language Models
Sherzod Hakimov
Lara Pfennigschmidt
David Schlangen
ELM
148
0
0
17 Feb 2025
RePrompt: Planning by Automatic Prompt Engineering for Large Language Models Agents
Weizhe Chen
Sven Koenig
B. Dilkina
LLMAG
214
12
0
17 Feb 2025
Hypothesis-Driven Theory-of-Mind Reasoning for Large Language Models
Hyunwoo Kim
Melanie Sclar
Tan Zhi-Xuan
Lance Ying
Sydney Levine
Yang Liu
Joshua B. Tenenbaum
Yejin Choi
LRM
LLMAG
119
3
0
17 Feb 2025
Uncertainty-Aware Step-wise Verification with Generative Reward Models
Zihuiwen Ye
Luckeciano C. Melo
Younesse Kaddar
Phil Blunsom
Shivalika Singh
Yarin Gal
LRM
149
5
0
16 Feb 2025
Quantifying the Capability Boundary of DeepSeek Models: An Application-Driven Performance Analysis
Kaikai Zhao
Zhaoxiang Liu
Xuejiao Lei
Rongjia Du
Zhenhong Long
...
Minjie Hua
Kai Wang
Wen Liu
Ning Wang
Kai Wang
ELM
LRM
109
1
0
16 Feb 2025
Safety Evaluation of DeepSeek Models in Chinese Contexts
Wenjing Zhang
Xuejiao Lei
Zhaoxiang Liu
Rongjia Du
Zhenhong Long
...
Jiaojiao Zhao
Minjie Hua
Chaoyang Ma
Kai Wang
Kai Wang
ELM
215
9
0
16 Feb 2025
PEA: Enhancing LLM Performance on Computational-Reasoning Tasks
Zi Wang
Shiwei Weng
Mohannad J. Alhanahnah
S. Jha
Tom Reps
LRM
ReLM
119
0
0
16 Feb 2025
Exposing Numeracy Gaps: A Benchmark to Evaluate Fundamental Numerical Abilities in Large Language Models
Haoyang Li
Xuejia Chen
Zhanchao Xu
Darian Li
Nicole Hu
...
Yongbin Li
Luyu Qiu
C. Zhang
Qing Li
Lei Chen
ELM
LRM
120
1
0
16 Feb 2025
Typhoon T1: An Open Thai Reasoning Model
Pittawat Taveekitworachai
Potsawee Manakul
Kasima Tharnpipitchai
Kunat Pipatanakul
OffRL
LRM
305
0
0
13 Feb 2025
DeepSeek on a Trip: Inducing Targeted Visual Hallucinations via Representation Vulnerabilities
Chashi Mahiul Islam
Samuel Jacob Chacko
Preston Horne
Xiuwen Liu
194
2
0
11 Feb 2025
Trustworthy AI: Safety, Bias, and Privacy -- A Survey
Xingli Fang
Jianwei Li
Varun Mulchandani
Jung-Eun Kim
93
0
0
11 Feb 2025
When More is Less: Understanding Chain-of-Thought Length in LLMs
Yuyang Wu
Yifei Wang
Tianqi Du
Stefanie Jegelka
Yisen Wang
LRM
165
51
0
11 Feb 2025
Evaluating the Systematic Reasoning Abilities of Large Language Models through Graph Coloring
Alex Heyman
Joel Zylberberg
LRM
126
1
0
10 Feb 2025
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates
L. Yang
Zhaochen Yu
Tengjiao Wang
Mengdi Wang
ReLM
LRM
AI4CE
194
18
0
10 Feb 2025
VersaPRM: Multi-Domain Process Reward Model via Synthetic Reasoning Data
Thomas Zeng
Shuibai Zhang
Shutong Wu
Christian Classen
Daewon Chae
...
Jungtaek Kim
H. Koo
Kannan Ramchandran
Dimitris Papailiopoulos
Kangwook Lee
LRM
122
4
0
10 Feb 2025
Digital Twin Buildings: 3D Modeling, GIS Integration, and Visual Descriptions Using Gaussian Splatting, ChatGPT/Deepseek, and Google Maps Platform
K. Gao
Dening Lu
Liangzhi Li
Nan Chen
Hongjie He
Linlin Xu
Jonathan Li
3DGS
3DPC
AI4CE
199
1
0
09 Feb 2025
Enhancing Depression Detection with Chain-of-Thought Prompting: From Emotion to Reasoning Using Large Language Models
Shiyu Teng
Jiaqing Liu
R. Jain
Shurong Chai
Ruibo Hou
T. Tateyama
Lanfen Lin
Yuxiao Chen
AI4MH
LRM
99
1
0
09 Feb 2025
Dynamic Chain-of-Thought: Towards Adaptive Deep Reasoning
Libo Wang
LRM
484
3
0
07 Feb 2025
LLMs Can Teach Themselves to Better Predict the Future
Benjamin Turtel
Danny Franklin
Philipp Schoenegger
LRM
228
1
0
07 Feb 2025
Brief analysis of DeepSeek R1 and its implications for Generative AI
Sarah Mercer
Samuel Spillard
Daniel P. Martin
205
15
0
04 Feb 2025
Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
Alireza Amiri
Xinting Huang
Mark Rofin
Michael Hahn
LRM
577
3
0
04 Feb 2025
OverThink: Slowdown Attacks on Reasoning LLMs
A. Kumar
Jaechul Roh
A. Naseh
Marzena Karpinska
Mohit Iyyer
Amir Houmansadr
Eugene Bagdasarian
LRM
186
25
0
04 Feb 2025
Can LLMs Maintain Fundamental Abilities under KV Cache Compression?
Xiang Liu
Zhenheng Tang
Hong Chen
Peijie Dong
Zeyu Li
Xiuze Zhou
Bo Li
Xuming Hu
Xiaowen Chu
505
7
0
04 Feb 2025
What is a Number, That a Large Language Model May Know It?
Raja Marjieh
Veniamin Veselovsky
Thomas Griffiths
Ilia Sucholutsky
463
3
0
03 Feb 2025
Explainable Sentiment Analysis with DeepSeek-R1: Performance, Efficiency, and Few-Shot Learning
Donghao Huang
Zhaoxia Wang
LRM
43
0
0
03 Feb 2025
Calling a Spade a Heart: Gaslighting Multimodal Large Language Models via Negation
Bin Zhu
Hui yan Qi
Yinxuan Gui
Jingjing Chen
Chong-Wah Ngo
Ee-Peng Lim
469
2
0
31 Jan 2025
GuardReasoner: Towards Reasoning-based LLM Safeguards
Yue Liu
Hongcheng Gao
Shengfang Zhai
Jun Xia
Tianyi Wu
Zhiwei Xue
Yuxiao Chen
Kenji Kawaguchi
Jiaheng Zhang
Bryan Hooi
AI4TS
LRM
288
26
0
30 Jan 2025
Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation
Satyapriya Krishna
Kalpesh Krishna
Anhad Mohananey
Steven Schwarcz
Adam Stambler
Shyam Upadhyay
Manaal Faruqui
ReLM
3DV
LRM
RALM
106
30
0
28 Jan 2025
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Tianzhe Chu
Yuexiang Zhai
Jihan Yang
Shengbang Tong
Saining Xie
Dale Schuurmans
Quoc V. Le
Sergey Levine
Yi-An Ma
OffRL
271
128
0
28 Jan 2025
ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer
Lin Yueyu
Li Zhiyuan
Peter Yue
Liu Xiao
111
5
0
28 Jan 2025
FIT-Print: Towards False-claim-resistant Model Ownership Verification via Targeted Fingerprint
Shuo Shao
Haozhe Zhu
Hongwei Yao
Yiming Li
Minlie Huang
Zhan Qin
Kui Ren
AAML
488
0
0
26 Jan 2025
CodeMonkeys: Scaling Test-Time Compute for Software Engineering
Ryan Ehrlich
Bradley Brown
Jordan Juravsky
Ronald Clark
Christopher Ré
Azalia Mirhoseini
116
11
0
24 Jan 2025
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
Samira Abnar
Harshay Shah
Dan Busbridge
Alaaeldin Mohamed Elnouby Ali
J. Susskind
Vimal Thilak
MoE
LRM
158
10
0
21 Jan 2025
Automating the Detection of Code Vulnerabilities by Analyzing GitHub Issues
Daniele Cipollone
Changjie Wang
Mariano Scazzariello
Simone Ferlin
Maliheh Izadi
Dejan Kostic
Marco Chiesa
44
0
0
10 Jan 2025
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
Ruilin Luo
Zhuofan Zheng
Yifan Wang
Xinzhe Ni
Zicheng Lin
...
Yiyao Yu
C. Shi
Ruihang Chu
Jin Zeng
Yujiu Yang
LRM
247
25
0
08 Jan 2025
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
Xingyu Chen
Jiahao Xu
Tian Liang
Zhiwei He
Jianhui Pang
...
Zizhuo Zhang
Rui Wang
Zhaopeng Tu
Haitao Mi
Dong Yu
LRM
ReLM
240
197
0
30 Dec 2024
Next Patch Prediction for Autoregressive Visual Generation
Yatian Pang
Peng Jin
Shuo Yang
Bin Lin
Bin Zhu
...
Liuhan Chen
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
255
10
0
19 Dec 2024
SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion
Ximing Xing
Juncheng Hu
Jing Zhang
Dong Xu
Qian Yu
218
4
0
11 Dec 2024
Unifying KV Cache Compression for Large Language Models with LeanKV
Yanqi Zhang
Yuwei Hu
Runyuan Zhao
John C. S. Lui
Haibo Chen
MQ
290
7
0
04 Dec 2024
Enhancing Answer Reliability Through Inter-Model Consensus of Large Language Models
Alireza Amiri-Margavi
Iman Jebellat
Ehsan Jebellat
Seyed Pouyan Mousavi Davoudi
190
3
0
25 Nov 2024