2411.05875
Towards Improved Preference Optimization Pipeline: from Data Generation to Budget-Controlled Regularization
7 November 2024
Zhuotong Chen
Fang Liu
Jennifer Zhu
Wanyu Du
Yanjun Qi
Papers citing "Towards Improved Preference Optimization Pipeline: from Data Generation to Budget-Controlled Regularization"
2 / 2 papers shown
Title
Learn Your Reference Model for Real Good Alignment
Alexey Gorbatovski
Boris Shaposhnikov
Alexey Malakhov
Nikita Surnachev
Yaroslav Aksenov
Ian Maksimov
Nikita Balagansky
Daniil Gavrilov
OffRL
129
35
0
15 Apr 2024
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators
Yann Dubois
Balázs Galambosi
Percy Liang
Tatsunori Hashimoto
ALM
176
403
0
06 Apr 2024