2410.09893
Cited By
RMB: Comprehensively Benchmarking Reward Models in LLM Alignment
13 October 2024
Enyu Zhou, Guodong Zheng, B. Wang, Zhiheng Xi, Shihan Dou, Rong Bao, Wei Shen, Limao Xiong, Jessica Fan, Yurong Mou, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
ALM
Papers citing
"RMB: Comprehensively Benchmarking Reward Models in LLM Alignment"
14 / 14 papers shown
WorldPM: Scaling Human Preference Modeling
B. Wang, Runji Lin, K. Lu, L. Yu, Z. Zhang, ..., Xuanjing Huang, Yu Jiang, Bowen Yu, J. Zhou, Junyang Lin
19
0
0
15 May 2025
Self-Consuming Generative Models with Adversarially Curated Data
Xiukun Wei, Xueru Zhang
WIGM
34
0
0
14 May 2025
On the Robustness of Reward Models for Language Model Alignment
Jiwoo Hong, Noah Lee, Eunki Kim, Guijin Son, Woojin Chung, Aman Gupta, Shao Tang, James Thorne
29
0
0
12 May 2025
RM-R1: Reward Modeling as Reasoning
X. Chen, Gaotang Li, Z. Wang, Bowen Jin, Cheng Qian, ..., Y. Zhang, D. Zhang, Tong Zhang, Hanghang Tong, Heng Ji
ReLM
OffRL
LRM
156
0
0
05 May 2025
Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators
Yilun Zhou, Austin Xu, Peifeng Wang, Caiming Xiong, Shafiq R. Joty
ELM
ALM
LRM
50
2
0
21 Apr 2025
Energy-Based Reward Models for Robust Language Model Alignment
Anamika Lochab, Ruqi Zhang
130
0
0
17 Apr 2025
A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
Jialun Zhong, Wei Shen, Yanzeng Li, Songyang Gao, Hua Lu, Yicheng Chen, Yang Zhang, Wei Zhou, Jinjie Gu, Lei Zou
LRM
45
2
0
12 Apr 2025
Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program
Minghe Gao, Xuqi Liu, Zhongqi Yue, Y. Wu, Shuang Chen, Juncheng Billy Li, Siliang Tang, Fei Wu, Tat-Seng Chua, Yueting Zhuang
OffRL
LRM
39
1
0
09 Apr 2025
Inference-Time Scaling for Generalist Reward Modeling
Zijun Liu, P. Wang, R. Xu, Shirong Ma, Chong Ruan, Peng Li, Yang Janet Liu, Y. Wu
OffRL
LRM
46
10
0
03 Apr 2025
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning
Weiyun Wang, Zhangwei Gao, L. Chen, Zhe Chen, Jinguo Zhu, ..., Lewei Lu, Haodong Duan, Yu Qiao, Jifeng Dai, Wenhai Wang
LRM
60
10
0
13 Mar 2025
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model
Yuhang Zang, Xiaoyi Dong, Pan Zhang, Yuhang Cao, Ziyu Liu, ..., Haodong Duan, W. Zhang, Kai Chen, D. Lin, Jiaqi Wang
VLM
72
19
0
21 Jan 2025
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment
Zhuoran Jin, Hongbang Yuan, Tianyi Men, Pengfei Cao, Yubo Chen, Kang-Jun Liu, Jun Zhao
ALM
82
7
0
18 Dec 2024
VLRewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models
Lei Li, Y. X. Wei, Zhihui Xie, Xuqing Yang, Yifan Song, ..., Tianyu Liu, Sujian Li, Bill Yuchen Lin, Lingpeng Kong, Q. Liu
CoGe
VLM
120
24
0
26 Nov 2024
M-RewardBench: Evaluating Reward Models in Multilingual Settings
Srishti Gureja, Lester James Validad Miranda, Shayekh Bin Islam, Rishabh Maheshwary, Drishti Sharma, Gusti Winata, Nathan Lambert, Sebastian Ruder, Sara Hooker, Marzieh Fadaee
LRM
35
15
0
20 Oct 2024