Simplify RLHF as Reward-Weighted SFT: A Variational Method


20 February 2025
Yuhao Du, Zehan Li, Pengyu Cheng, Zhihong Chen, Yuejiao Xie, Xiang Wan, Anningzhe Gao

Papers citing "Simplify RLHF as Reward-Weighted SFT: A Variational Method"

1 / 1 papers shown
Add-One-In: Incremental Sample Selection for Large Language Models via a Choice-Based Greedy Paradigm
Zehan Li, Yuhao Du, Xiaoqi Jiao, Yiwen Guo, Yuege Feng, Xiang Wan, Anningzhe Gao, Jinpeng Hu
04 Mar 2025