Title |
---|
Multi-turn Reinforcement Learning from Preference Human Feedback. Lior Shani, Aviv Rosenberg, Asaf B. Cassel, Oran Lang, Daniele Calandriello, ..., Bilal Piot, Idan Szpektor, Avinatan Hassidim, Yossi Matias, Rémi Munos |
RaFe: Ranking Feedback Improves Query Rewriting for RAG. Shengyu Mao, Yong-jia Jiang, Boli Chen, Xiao Li, Peng Wang, Xinyu Wang, Pengjun Xie, Fei Huang, Huajun Chen, Ningyu Zhang |
Agent Planning with World Knowledge Model. Shuofei Qiao, Runnan Fang, Ningyu Zhang, Yuqi Zhu, Xiang Chen, Shumin Deng, Yong-jia Jiang, Pengjun Xie, Fei Huang, Huajun Chen |