ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2504.17565
  4. Cited By
DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training
v1v2v3 (latest)

DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training

24 April 2025
Xiaoyu Tian
Sitong Zhao
Haotian Wang
Shuaiting Chen
Yiping Peng
Yunjie Ji
Han Zhao
Xiangang Li
    LRM
ArXiv (abs)PDFHTML

Papers citing "DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training"

11 / 11 papers shown
Title
Unearthing Gems from Stones: Policy Optimization with Negative Sample Augmentation for LLM Reasoning
Unearthing Gems from Stones: Policy Optimization with Negative Sample Augmentation for LLM Reasoning
Zhaohui Yang
Shilei Jiang
Chen Hu
Linjing Li
Shihong Deng
D. Jiang
OffRL
105
0
0
20 May 2025
Not All Correct Answers Are Equal: Why Your Distillation Source Matters
Not All Correct Answers Are Equal: Why Your Distillation Source Matters
Xiaoyu Tian
Yunjie Ji
Haotian Wang
Shuaiting Chen
Sitong Zhao
Yiping Peng
Han Zhao
Xiangang Li
LRM
126
0
0
20 May 2025
Exploring the Potential of Offline RL for Reasoning in LLMs: A Preliminary Study
Exploring the Potential of Offline RL for Reasoning in LLMs: A Preliminary Study
Xiaoyu Tian
Sitong Zhao
Haotian Wang
Shuaiting Chen
Yiping Peng
Yunjie Ji
Han Zhao
Xiangang Li
OffRLLRM
81
0
0
04 May 2025
How Difficulty-Aware Staged Reinforcement Learning Enhances LLMs' Reasoning Capabilities: A Preliminary Experimental Study
How Difficulty-Aware Staged Reinforcement Learning Enhances LLMs' Reasoning Capabilities: A Preliminary Experimental Study
Yunjie Ji
Sitong Zhao
Xiaoyu Tian
Haotian Wang
Shuaiting Chen
Yiping Peng
Han Zhao
Xiangang Li
LRM
116
3
0
01 Apr 2025
1.4 Million Open-Source Distilled Reasoning Dataset to Empower Large Language Model Training
1.4 Million Open-Source Distilled Reasoning Dataset to Empower Large Language Model Training
Han Zhao
Haotian Wang
Yiping Peng
Sitong Zhao
Xiaoyu Tian
Shuaiting Chen
Yunjie Ji
Xiangang Li
RALMReLMLRM
161
16
0
25 Mar 2025
Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking
Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking
Xiaoyu Tian
Sitong Zhao
Haotian Wang
Shuaiting Chen
Yunjie Ji
Yiping Peng
Han Zhao
Xiangang Li
ReLMELMLRM
117
10
0
25 Mar 2025
D3: Diversity, Difficulty, and Dependability-Aware Data Selection for Sample-Efficient LLM Instruction Tuning
D3: Diversity, Difficulty, and Dependability-Aware Data Selection for Sample-Efficient LLM Instruction Tuning
Jia Zhang
Chen-Xi Zhang
Yang Liu
Yi-Xuan Jin
Xiao-Wen Yang
Bo Zheng
Yi Liu
Lan-Zhe Guo
143
3
0
14 Mar 2025
KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding
Zhangchen Xu
Yang Liu
Yueqin Yin
Mingyuan Zhou
Radha Poovendran
ALMOffRL
130
18
0
04 Mar 2025
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models
Alon Albalak
Duy Phung
Nathan Lile
Rafael Rafailov
Kanishk Gandhi
...
Anikait Singh
Chase Blagden
Violet Xiang
Dakota Mahan
Nick Haber
OffRLLRM
100
16
0
24 Feb 2025
NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions
NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions
Weizhe Yuan
Jane Dwivedi-Yu
Song Jiang
Karthik Padthe
Yang Li
...
Ilia Kulikov
Kyunghyun Cho
Yuandong Tian
Jason Weston
Xian Li
ReLMLRM
166
20
0
18 Feb 2025
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
Siming Huang
Tianhao Cheng
J.K. Liu
Jiaran Hao
L. Song
...
Ge Zhang
Zili Wang
Yuan Qi
Yinghui Xu
Wei Chu
ALM
218
31
0
07 Nov 2024
1