arXiv:2410.15096
GDPO: Learning to Directly Align Language Models with Diversity Using GFlowNets
19 October 2024
Oh Joon Kwon
Daiki E. Matsunaga
Kee-Eung Kim
AI4CE
Papers citing "GDPO: Learning to Directly Align Language Models with Diversity Using GFlowNets"
1 / 1 papers shown
Title
DPO Meets PPO: Reinforced Token Optimization for RLHF
Han Zhong
Zikang Shan
Guhao Feng
Wei Xiong
Xinle Cheng
Li Zhao
Di He
Jiang Bian
Liwei Wang
147
72
0
29 Apr 2024