ResearchTrend.AI
DAPO: An Open-Source LLM Reinforcement Learning System at Scale


18 March 2025
Qiying Yu
Zhe Zhang
Ruofei Zhu
Yufeng Yuan
Xiaochen Zuo
Yu Yue
Tiantian Fan
Gaohong Liu
L. Liu
Xin Liu
H. Lin
Zhiqi Lin
Bole Ma
Guangming Sheng
Yuxuan Tong
Chenyi Zhang
Mofan Zhang
Wang Zhang
Hang Zhu
Jinhua Zhu
Jiaze Chen
Jiangjie Chen
Chunyang Wang
Hongli Yu
W. Dai
Yuxuan Song
Xiangpeng Wei
Hao Zhou
Jingjing Liu
Wei-Ying Ma
Ya-Qin Zhang
Lin Yan
Mu Qiao
Yonghui Wu
Mingxuan Wang
    OffRL
    LRM

Papers citing "DAPO: An Open-Source LLM Reinforcement Learning System at Scale"

10 / 60 papers shown
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
Haozhe Wang
Chao Qu
Zuming Huang
Wei Chu
Fangzhen Lin
Wenhu Chen
OffRL
ReLM
SyDa
LRM
VLM
80
2
0
10 Apr 2025
A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility
Andreas Hochlehnert
Hardik Bhatnagar
Vishaal Udandarao
Samuel Albanie
Ameya Prabhu
Matthias Bethge
ReLM
ALM
LRM
100
9
0
09 Apr 2025
Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization
Qingyang Zhang
Haitao Wu
Changqing Zhang
Peilin Zhao
Yatao Bian
ReLM
LRM
87
5
0
08 Apr 2025
Learning Lie Group Generators from Trajectories
Lifan Hu
45
0
0
04 Apr 2025
Improved Visual-Spatial Reasoning via R1-Zero-Like Training
Zhenyi Liao
Qingsong Xie
Yanhao Zhang
Zijian Kong
Haonan Lu
Zhenyu Yang
Zhijie Deng
ReLM
VLM
LRM
104
0
1
01 Apr 2025
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
Xiaoye Qu
Yafu Li
Zhaochen Su
Weigao Sun
Jianhao Yan
...
Chaochao Lu
Yue Zhang
Xian-Sheng Hua
Bowen Zhou
Yu Cheng
ReLM
OffRL
LRM
91
17
0
27 Mar 2025
Reasoning Beyond Limits: Advances and Open Problems for LLMs
M. Ferrag
Norbert Tihanyi
Merouane Debbah
ELM
OffRL
LRM
AI4CE
211
3
0
26 Mar 2025
Adaptive Group Policy Optimization: Towards Stable Training and Token-Efficient Reasoning
Chen Li
Nazhou Liu
Kai Yang
46
3
0
20 Mar 2025
Atom of Thoughts for Markov LLM Test-Time Scaling
Fengwei Teng
Zhaoyang Yu
Quan Shi
Jiayi Zhang
Chenglin Wu
Yuyu Luo
MU
LRM
58
15
0
17 Feb 2025
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning
Xupeng Miao
Gabriele Oliaro
Xinhao Cheng
Vineeth Kada
Ruohan Gao
...
April Yang
Yingcheng Wang
Mengdi Wu
Colin Unger
Zhihao Jia
MoE
94
9
0
29 Feb 2024