2504.17565
Cited By
DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training
24 April 2025
Xiaoyu Tian
Sitong Zhao
Haotian Wang
Shuaiting Chen
Yiping Peng
Yunjie Ji
Han Zhao
Xiangang Li
LRM
Papers citing "DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training"
11 / 11 papers shown
Title
Unearthing Gems from Stones: Policy Optimization with Negative Sample Augmentation for LLM Reasoning
Zhaohui Yang
Shilei Jiang
Chen Hu
Linjing Li
Shihong Deng
D. Jiang
OffRL
105
0
0
20 May 2025
Not All Correct Answers Are Equal: Why Your Distillation Source Matters
Xiaoyu Tian
Yunjie Ji
Haotian Wang
Shuaiting Chen
Sitong Zhao
Yiping Peng
Han Zhao
Xiangang Li
LRM
126
0
0
20 May 2025
Exploring the Potential of Offline RL for Reasoning in LLMs: A Preliminary Study
Xiaoyu Tian
Sitong Zhao
Haotian Wang
Shuaiting Chen
Yiping Peng
Yunjie Ji
Han Zhao
Xiangang Li
OffRL
LRM
81
0
0
04 May 2025
How Difficulty-Aware Staged Reinforcement Learning Enhances LLMs' Reasoning Capabilities: A Preliminary Experimental Study
Yunjie Ji
Sitong Zhao
Xiaoyu Tian
Haotian Wang
Shuaiting Chen
Yiping Peng
Han Zhao
Xiangang Li
LRM
116
3
0
01 Apr 2025
1.4 Million Open-Source Distilled Reasoning Dataset to Empower Large Language Model Training
Han Zhao
Haotian Wang
Yiping Peng
Sitong Zhao
Xiaoyu Tian
Shuaiting Chen
Yunjie Ji
Xiangang Li
RALM
ReLM
LRM
161
16
0
25 Mar 2025
Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking
Xiaoyu Tian
Sitong Zhao
Haotian Wang
Shuaiting Chen
Yunjie Ji
Yiping Peng
Han Zhao
Xiangang Li
ReLM
ELM
LRM
117
10
0
25 Mar 2025
D3: Diversity, Difficulty, and Dependability-Aware Data Selection for Sample-Efficient LLM Instruction Tuning
Jia Zhang
Chen-Xi Zhang
Yang Liu
Yi-Xuan Jin
Xiao-Wen Yang
Bo Zheng
Yi Liu
Lan-Zhe Guo
143
3
0
14 Mar 2025
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding
Zhangchen Xu
Yang Liu
Yueqin Yin
Mingyuan Zhou
Radha Poovendran
ALM
OffRL
130
18
0
04 Mar 2025
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models
Alon Albalak
Duy Phung
Nathan Lile
Rafael Rafailov
Kanishk Gandhi
...
Anikait Singh
Chase Blagden
Violet Xiang
Dakota Mahan
Nick Haber
OffRL
LRM
100
16
0
24 Feb 2025
NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions
Weizhe Yuan
Jane Dwivedi-Yu
Song Jiang
Karthik Padthe
Yang Li
...
Ilia Kulikov
Kyunghyun Cho
Yuandong Tian
Jason Weston
Xian Li
ReLM
LRM
166
20
0
18 Feb 2025
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
Siming Huang
Tianhao Cheng
J.K. Liu
Jiaran Hao
L. Song
...
Ge Zhang
Zili Wang
Yuan Qi
Yinghui Xu
Wei Chu
ALM
218
31
0
07 Nov 2024