arXiv:2501.12368
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model
21 January 2025
Yuhang Zang
Xiaoyi Dong
Pan Zhang
Yuhang Cao
Ziyu Liu
Shengyuan Ding
Shenxi Wu
Yubo Ma
Haodong Duan
Feiyu Xiong
Kai Chen
Dahua Lin
Jiaqi Wang
VLM
Papers citing "InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model"
15 / 15 papers shown
MR. Judge: Multimodal Reasoner as a Judge
Renjie Pi
Felix Bai
Qibin Chen
Simon Wang
Jiulong Shan
Kieran Liu
Meng Cao
ELM
LRM
17
0
0
19 May 2025
MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning
Ke Wang
Junting Pan
Linda Wei
Aojun Zhou
Weikang Shi
...
Han Xiao
Yiran Yang
Houxing Ren
Mingjie Zhan
Hongsheng Li
37
0
0
15 May 2025
Skywork-VL Reward: An Effective Reward Model for Multimodal Understanding and Reasoning
Xiaokun Wang
Chris
Jiangbo Pei
Wei Shen
Yi Peng
...
Ai Jian
Tianyidan Xie
Xuchen Song
Yang Liu
Yahui Zhou
OffRL
LRM
33
0
0
12 May 2025
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning
Yibin Wang
Zhimin Li
Yuhang Zang
Chunyu Wang
Qinglin Lu
Cheng Jin
Jinqiao Wang
LRM
53
2
0
06 May 2025
R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
Yi-Fan Zhang
Xingyu Lu
X. Hu
Chaoyou Fu
Bin Wen
...
Jianfei Chen
Fan Yang
Z. Zhang
Tingting Gao
Liang Wang
OffRL
LRM
48
1
0
05 May 2025
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model
Haozhan Shen
Peng Liu
Junlong Li
Chunxin Fang
Yibo Ma
...
Zilun Zhang
Kangjia Zhao
Qianqian Zhang
Ruochen Xu
Tiancheng Zhao
VLM
LRM
76
36
0
10 Apr 2025
SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
Hardy Chen
Haoqin Tu
Fali Wang
Hui Liu
Xianfeng Tang
Xinya Du
Yuyin Zhou
Cihang Xie
ReLM
VLM
OffRL
LRM
77
9
0
10 Apr 2025
MM-IFEngine: Towards Multimodal Instruction Following
Shengyuan Ding
Shenxi Wu
Xiangyu Zhao
Yuhang Zang
Haodong Duan
Xiaoyi Dong
Pan Zhang
Yuhang Cao
Dahua Lin
Jiaqi Wang
OffRL
60
2
0
10 Apr 2025
Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program
Minghe Gao
Xuqi Liu
Zhongqi Yue
Y. Wu
Shuang Chen
Juncheng Billy Li
Siliang Tang
Fei Wu
Tat-Seng Chua
Yueting Zhuang
OffRL
LRM
44
1
0
09 Apr 2025
Boosting Virtual Agent Learning and Reasoning: A Step-wise, Multi-dimensional, and Generalist Reward Model with Benchmark
Bingchen Miao
Y. Wu
Minghe Gao
Qifan Yu
Wendong Bu
Wenqiao Zhang
Yunfei Li
Siliang Tang
Tat-Seng Chua
Juncheng Billy Li
LLMAG
LRM
66
0
0
24 Mar 2025
Sightation Counts: Leveraging Sighted User Feedback in Building a BLV-aligned Dataset of Diagram Descriptions
Wan Ju Kang
Eunki Kim
Na Min An
Sangryul Kim
Haemin Choi
Ki Hoon Kwak
James Thorne
54
0
0
17 Mar 2025
From Captions to Rewards (CAREVL): Leveraging Large Language Model Experts for Enhanced Reward Modeling in Large Vision-Language Models
Muzhi Dai
Jiashuo Sun
Zhiyuan Zhao
Shixuan Liu
Rui Li
Junyu Gao
Xuelong Li
VLM
55
1
0
08 Mar 2025
Unified Reward Model for Multimodal Understanding and Generation
Yibin Wang
Yuhang Zang
Hao Li
Cheng Jin
Rongxiang Weng
EGVM
78
5
0
07 Mar 2025
Visual-RFT: Visual Reinforcement Fine-Tuning
Ziyu Liu
Zeyi Sun
Yuhang Zang
Xiaoyi Dong
Yuhang Cao
Haodong Duan
Dahua Lin
Jiaqi Wang
ObjD
VLM
LRM
78
49
0
03 Mar 2025
Investigating Inference-time Scaling for Chain of Multi-modal Thought: A Preliminary Study
Yujie Lin
Ante Wang
Moye Chen
Jingyao Liu
Hao Liu
Jinsong Su
Xinyan Xiao
LRM
50
2
0
17 Feb 2025