2502.19255
Cited By
Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective
20 May 2025
Jiawei Huang
Bingcong Li
Christoph Dann
Niao He
OffRL
Papers citing
"Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective"
No papers