arXiv: 2309.00754
Efficient RLHF: Reducing the Memory Usage of PPO
1 September 2023
Michael Santacroce, Yadong Lu, Han Yu, Yuan-Fang Li, Yelong Shen
Papers citing "Efficient RLHF: Reducing the Memory Usage of PPO" (6 papers)
- Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning. Chongyu Fan, Jiancheng Liu, Licong Lin, Jinghan Jia, Ruiqi Zhang, Song Mei, Sijia Liu. 09 Oct 2024.
- HybridFlow: A Flexible and Efficient RLHF Framework. Guangming Sheng, Chi Zhang, Zilingfeng Ye, Xibin Wu, Wang Zhang, Ru Zhang, Size Zheng, Haibin Lin, Chuan Wu. 28 Sep 2024.
- Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization. Yuxin Jiang, Bo Huang, Yufei Wang, Xingshan Zeng, Liangyou Li, Yasheng Wang, Xin Jiang, Lifeng Shang, Ruiming Tang, Wei Wang. 14 Aug 2024.
- A Survey on LoRA of Large Language Models. Yuren Mao, Yuhang Ge, Yijiang Fan, Wenyi Xu, Yu Mi, Zhonghao Hu, Yunjun Gao. 08 Jul 2024.
- Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study. Shusheng Xu, Wei Fu, Jiaxuan Gao, Wenjie Ye, Weiling Liu, Zhiyu Mei, Guangju Wang, Chao Yu, Yi Wu. 16 Apr 2024.
- Group Preference Optimization: Few-Shot Alignment of Large Language Models. Siyan Zhao, John Dang, Aditya Grover. 17 Oct 2023.