arXiv:2503.11751
reWordBench: Benchmarking and Improving the Robustness of Reward Models with Transformed Inputs
14 March 2025
Zhaofeng Wu, Michihiro Yasunaga, Andrew Cohen, Yoon Kim, Asli Celikyilmaz, Marjan Ghazvininejad
Papers citing
"reWordBench: Benchmarking and Improving the Robustness of Reward Models with Transformed Inputs"
HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages
Ziqi Wang, Jiaqi Zeng, Olivier Delalleau, Hoo-Chang Shin, Felipe Soares, Alexander Bukharin, Ellie Evans, Yi Dong, Oleksii Kuchaiev
16 May 2025
Adversarial Training of Reward Models
Alexander Bukharin, Haifeng Qian, Shengyang Sun, Adithya Renduchintala, Soumye Singhal, Ziqi Wang, Oleksii Kuchaiev, Olivier Delalleau, T. Zhao
08 Apr 2025