
Learn Your Reference Model for Real Good Alignment
Alexey Gorbatovski
Boris Shaposhnikov
Alexey Malakhov
Nikita Surnachev
Yaroslav Aksenov
Ian Maksimov
Nikita Balagansky
Daniil Gavrilov
Papers citing "Learn Your Reference Model for Real Good Alignment"
50 / 78 papers shown
Title |
---|
![]() Towards a Unified View of Preference Learning for Large Language Models:
A Survey Bofei Gao Feifan Song Yibo Miao Zefan Cai Zhiyong Yang ...Houfeng Wang Zhifang Sui Peiyi Wang Baobao Chang Baobao Chang |
![]() Secrets of RLHF in Large Language Models Part II: Reward Modeling Bing Wang Rui Zheng Luyao Chen Yan Liu Shihan Dou ...Qi Zhang Xipeng Qiu Xuanjing Huang Zuxuan Wu Yuanyuan Jiang |
![]() Nash Learning from Human Feedback Rémi Munos Michal Valko Daniele Calandriello M. G. Azar Mark Rowland ...Nikola Momchev Olivier Bachem D. Mankowitz Doina Precup Bilal Piot |
![]() Llama 2: Open Foundation and Fine-Tuned Chat Models Hugo Touvron Louis Martin Kevin R. Stone Peter Albert Amjad Almahairi ...Sharan Narang Aurelien Rodriguez Robert Stojnic Sergey Edunov Thomas Scialom |