Title |
---|
![]() Towards a Unified View of Preference Learning for Large Language Models:
A Survey Bofei Gao Feifan Song Yibo Miao Zefan Cai Zhengyuan Yang ...Houfeng Wang Zhifang Sui Peiyi Wang Baobao Chang Baobao Chang |
![]() Online Merging Optimizers for Boosting Rewards and Mitigating Tax in
Alignment Keming Lu Bowen Yu Fei Huang Yang Fan Runji Lin Chang Zhou |