Title |
---|
Toward Optimal LLM Alignments Using Two-Player Games Rui Zheng Hongyi Guo Zhihan Liu Xiaoying Zhang Yuanshun Yao ...Tao Gui Qi Zhang Xuanjing Huang Hang Li Yang Liu |
CBGBench: Fill in the Blank of Protein-Molecule Complex Binding Graph Haitao Lin Guojiang Zhao Odin Zhang Yufei Huang Lirong Wu Zicheng Liu Siyuan Li Cheng Tan Zhifeng Gao Stan Z. Li |
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback Hamish Ivison Yizhong Wang Jiacheng Liu Zeqiu Wu Valentina Pyatkin Nathan Lambert Noah A. Smith Yejin Choi Hannaneh Hajishirzi |