ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2412.16849
  4. Cited By
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks
  with Reinforcement Fine-Tuning

OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning

22 December 2024
Yuxiang Zhang
Yuqi Yang
Jiangming Shu
Yuhang Wang
Jinlin Xiao
Jitao Sang
    ALMVLMOffRLLRM
ArXiv (abs)PDFHTMLGithub (140★)

Papers citing "OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning"

6 / 6 papers shown
Title
Dynamic Chain-of-Thought: Towards Adaptive Deep Reasoning
Dynamic Chain-of-Thought: Towards Adaptive Deep Reasoning
Libo Wang
LRM
411
3
0
07 Feb 2025
o1-Coder: an o1 Replication for Coding
Yuxiang Zhang
Shangxi Wu
Yuqi Yang
Jiangming Shu
Jinlin Xiao
Chao Kong
Jitao Sang
LRM
130
48
0
29 Nov 2024
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
Jian Hu
Xibin Wu
Weixun Wang
OpenLLMAI Team
Dehao Zhang
Yu Cao
AI4CEVLM
88
130
0
20 May 2024
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLMALM
880
12,973
0
04 Mar 2022
N2N Learning: Network to Network Compression via Policy Gradient
  Reinforcement Learning
N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning
A. Ashok
Nicholas Rhinehart
Fares N. Beainy
Kris Kitani
62
170
0
18 Sep 2017
Proximal Policy Optimization Algorithms
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
517
19,065
0
20 Jul 2017
1