2412.16849
Cited By
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning
22 December 2024
Yuxiang Zhang
Yuqi Yang
Jiangming Shu
Yuhang Wang
Jinlin Xiao
Jitao Sang
ALM
VLM
OffRL
LRM
Github (140★)
Papers citing
"OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning"
6 / 6 papers shown
Dynamic Chain-of-Thought: Towards Adaptive Deep Reasoning
Libo Wang
LRM
411
3
0
07 Feb 2025
o1-Coder: an o1 Replication for Coding
Yuxiang Zhang
Shangxi Wu
Yuqi Yang
Jiangming Shu
Jinlin Xiao
Chao Kong
Jitao Sang
LRM
130
48
0
29 Nov 2024
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
Jian Hu
Xibin Wu
Weixun Wang
OpenLLMAI Team
Dehao Zhang
Yu Cao
AI4CE
VLM
88
130
0
20 May 2024
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
880
12,973
0
04 Mar 2022
N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning
A. Ashok
Nicholas Rhinehart
Fares N. Beainy
Kris Kitani
62
170
0
18 Sep 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
517
19,065
0
20 Jul 2017