Secrets of RLHF in Large Language Models Part II: Reward Modeling

11 January 2024

Caishuang Huang

Wei Shen

Xipeng Qiu

Xuanjing Huang

Zuxuan Wu
