Title |
---|
![]() Secrets of RLHF in Large Language Models Part II: Reward Modeling Bing Wang Rui Zheng Luyao Chen Yan Liu Shihan Dou ...Qi Zhang Xipeng Qiu Xuanjing Huang Zuxuan Wu Yuanyuan Jiang |
![]() WebArena: A Realistic Web Environment for Building Autonomous Agents Shuyan Zhou Frank F. Xu Hao Zhu Xuhui Zhou Robert Lo ...Tianyue Ou Yonatan Bisk Daniel Fried Uri Alon Graham Neubig |