2502.11026
Cited By
Simplify RLHF as Reward-Weighted SFT: A Variational Method
20 February 2025
Yuhao Du, Zehan Li, Pengyu Cheng, Zhihong Chen, Yuejiao Xie, Xiang Wan, Anningzhe Gao
Papers citing "Simplify RLHF as Reward-Weighted SFT: A Variational Method"
1 / 1 papers shown
Title
Add-One-In: Incremental Sample Selection for Large Language Models via a Choice-Based Greedy Paradigm
Zehan Li, Yuhao Du, Xiaoqi Jiao, Yiwen Guo, Yuege Feng, Xiang Wan, Anningzhe Gao, Jinpeng Hu
04 Mar 2025