
BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling
Papers citing "BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling"
27 papers
Title | Authors
---|---
RRM: Robust Reward Model Training Mitigates Reward Hacking | Tianqi Liu, Wei Xiong, Jie Jessie Ren, Lichang Chen, Junru Wu, ..., Yuan Liu, Bilal Piot, Abe Ittycheriah, Aviral Kumar, Mohammad Saleh