ResearchTrend.AI
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model


21 January 2025
Yuhang Zang
Xiaoyi Dong
Pan Zhang
Yuhang Cao
Ziyu Liu
Shengyuan Ding
Shenxi Wu
Yubo Ma
Haodong Duan
Feiyu Xiong
Kai Chen
Dahua Lin
Jiaqi Wang
    VLM

Papers citing "InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model"

16 / 16 papers shown
SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward
Kaixuan Fan
Kaituo Feng
Haoming Lyu
Dongzhan Zhou
Xiangyu Yue
ReLM
LRM
10
0
0
22 May 2025
MR. Judge: Multimodal Reasoner as a Judge
Renjie Pi
Felix Bai
Qibin Chen
Simon Wang
Jiulong Shan
Kieran Liu
Meng Cao
ELM
LRM
21
0
0
19 May 2025
MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning
Ke Wang
Junting Pan
Linda Wei
Aojun Zhou
Weikang Shi
...
Han Xiao
Yiran Yang
Houxing Ren
Mingjie Zhan
Hongsheng Li
37
0
0
15 May 2025
Skywork-VL Reward: An Effective Reward Model for Multimodal Understanding and Reasoning
Xiaokun Wang
Chris
Jiangbo Pei
Wei Shen
Yi Peng
...
Ai Jian
Tianyidan Xie
Xuchen Song
Yang Liu
Yahui Zhou
OffRL
LRM
33
0
0
12 May 2025
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning
Yibin Wang
Zhimin Li
Yuhang Zang
Chunyu Wang
Qinglin Lu
Cheng Jin
Jinqiao Wang
LRM
53
2
0
06 May 2025
R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
Yi-Fan Zhang
Xingyu Lu
X. Hu
Chaoyou Fu
Bin Wen
...
Zheyu Shen
Fan Yang
Z. Zhang
Yan Li
Liang Wang
OffRL
LRM
48
1
0
05 May 2025
MM-IFEngine: Towards Multimodal Instruction Following
Shengyuan Ding
Shenxi Wu
Xiangyu Zhao
Yuhang Zang
Haodong Duan
Xiaoyi Dong
Pan Zhang
Yuhang Cao
Dahua Lin
Jiaqi Wang
OffRL
60
2
0
10 Apr 2025
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model
Haozhan Shen
Peng Liu
Jianxin Li
Chunxin Fang
Yibo Ma
...
Zilun Zhang
Kangjia Zhao
Qianqian Zhang
Ruochen Xu
Tiancheng Zhao
VLM
LRM
76
36
0
10 Apr 2025
SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
Hardy Chen
Haoqin Tu
Fali Wang
Hui Liu
Xianfeng Tang
Xinya Du
Yuyin Zhou
Cihang Xie
ReLM
VLM
OffRL
LRM
77
9
0
10 Apr 2025
Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program
Minghe Gao
Xuqi Liu
Zhongqi Yue
Y. Wu
Shuang Chen
Juncheng Billy Li
Siliang Tang
Fei Wu
Tat-Seng Chua
Yueting Zhuang
OffRL
LRM
44
1
0
09 Apr 2025
Boosting Virtual Agent Learning and Reasoning: A Step-wise, Multi-dimensional, and Generalist Reward Model with Benchmark
Bingchen Miao
Y. Wu
Minghe Gao
Qifan Yu
Wendong Bu
Wenqiao Zhang
Yunfei Li
Siliang Tang
Tat-Seng Chua
Juncheng Billy Li
LLMAG
LRM
66
0
0
24 Mar 2025
Sightation Counts: Leveraging Sighted User Feedback in Building a BLV-aligned Dataset of Diagram Descriptions
Wan Ju Kang
Eunki Kim
Na Min An
Sangryul Kim
Haemin Choi
Ki Hoon Kwak
James Thorne
54
0
0
17 Mar 2025
From Captions to Rewards (CAREVL): Leveraging Large Language Model Experts for Enhanced Reward Modeling in Large Vision-Language Models
Muzhi Dai
Jiashuo Sun
Zhiyuan Zhao
Shixuan Liu
Rui Li
Junyu Gao
Xuelong Li
VLM
58
1
0
08 Mar 2025
Unified Reward Model for Multimodal Understanding and Generation
Yibin Wang
Yuhang Zang
Hao Li
Cheng Jin
Rongxiang Weng
EGVM
78
5
0
07 Mar 2025
Visual-RFT: Visual Reinforcement Fine-Tuning
Ziyu Liu
Zeyi Sun
Yuhang Zang
Xiaoyi Dong
Yuhang Cao
Haodong Duan
Dahua Lin
Jiaqi Wang
ObjD
VLM
LRM
80
49
0
03 Mar 2025
Investigating Inference-time Scaling for Chain of Multi-modal Thought: A Preliminary Study
Yujie Lin
Ante Wang
Moye Chen
Jingyao Liu
Hao Liu
Jinsong Su
Xinyan Xiao
LRM
52
2
0
17 Feb 2025