2506.04280
Cited By
Evaluating MLLMs with Multimodal Multi-image Reasoning Benchmark
4 June 2025
Ziming Cheng
Binrui Xu
Lisheng Gong
Zuhe Song
Tianshuo Zhou
Shiqi Zhong
Siyu Ren
Mingxiang Chen
Xiangchao Meng
Y. Zhang
Yanlin Li
Lei Ren
Wei Chen
Zhiyuan Huang
Mingjie Zhan
Xiaojie Wang
Fangxiang Feng
VLM
LRM
Papers citing
"Evaluating MLLMs with Multimodal Multi-image Reasoning Benchmark"
24 / 24 papers shown
Title
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Jinguo Zhu
Weiyun Wang
Zhe Chen
Ziwei Liu
Shenglong Ye
...
Dahua Lin
Yu Qiao
Jifeng Dai
Wenhai Wang
Wei Wang
MLLM
VLM
169
128
1
14 Apr 2025
Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program
Minghe Gao
Xuqi Liu
Zhongqi Yue
Y. Wu
Shuang Chen
Juncheng Billy Li
Siliang Tang
Leilei Gan
Tat-Seng Chua
Yueting Zhuang
OffRL
LRM
74
5
0
09 Apr 2025
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning
Weiyun Wang
Zhangwei Gao
Lawrence Yunliang Chen
Zhe Chen
Jinguo Zhu
...
Lewei Lu
Haodong Duan
Yu Qiao
Jifeng Dai
Wenhai Wang
LRM
119
38
0
13 Mar 2025
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs
Omkar Thawakar
Dinura Dissanayake
Ketan More
Ritesh Thawkar
Ahmed Heakl
...
Hisham Cholakkal
Ivan Laptev
Mubarak Shah
Fahad Shahbaz Khan
Salman Khan
VLM
LRM
112
57
0
10 Jan 2025
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
Weiyun Wang
Zhe Chen
Wenhai Wang
Yue Cao
Yangzhou Liu
...
Jinguo Zhu
X. Zhu
Lewei Lu
Yu Qiao
Jifeng Dai
LRM
127
91
1
15 Nov 2024
MiCEval: Unveiling Multimodal Chain of Thought's Quality via Image Description and Reasoning Steps
Xiongtao Zhou
Jie He
Lanyu Chen
Jingyu Li
Haojing Chen
Víctor Gutiérrez-Basulto
Jeff Z. Pan
Ningyu Zhang
LRM
125
2
0
18 Oct 2024
VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment
Lei Li
Zhihui Xie
Mukai Li
Shunian Chen
Peiyi Wang
L. Chen
Yazheng Yang
Benyou Wang
Dianbo Sui
Qiang Liu
VLM
ALM
82
28
0
12 Oct 2024
LLaVA-OneVision: Easy Visual Task Transfer
Bo Li
Yuanhan Zhang
Dong Guo
Renrui Zhang
Feng Li
Hao Zhang
Kaichen Zhang
Yanwei Li
Ziwei Liu
Chunyuan Li
MLLM
SyDa
VLM
117
860
0
06 Aug 2024
MIBench: Evaluating Multimodal Large Language Models over Multiple Images
Haowei Liu
Xi Zhang
Haiyang Xu
Yaya Shi
Chaoya Jiang
...
Ji Zhang
Fei Huang
Chunfen Yuan
Bing Li
Weiming Hu
VLM
71
15
0
21 Jul 2024
ReMI: A Dataset for Reasoning with Multiple Images
Mehran Kazemi
Nishanth Dikkala
Ankit Anand
Petar Dević
Ishita Dasgupta
...
Bahare Fatemi
Pranjal Awasthi
Dee Guo
Sreenivas Gollapudi
Ahmed Qureshi
LRM
VLM
103
17
0
13 Jun 2024
M³CoT: A Novel Benchmark for Multi-Domain Multi-step Multi-modal Chain-of-Thought
Qiguang Chen
Libo Qin
Jin Zhang
Zhi Chen
Xiao Xu
Wanxiang Che
LRM
101
61
0
26 May 2024
WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning
Yuanhan Zhang
Kaichen Zhang
Yue Liu
Fanyi Pu
Christopher Arif Setiadharma
Jingkang Yang
Ziwei Liu
VGen
88
10
0
06 May 2024
MANTIS: Interleaved Multi-Image Instruction Tuning
Dongfu Jiang
Xuan He
Huaye Zeng
Cong Wei
Max Ku
Qian Liu
Wenhu Chen
VLM
MLLM
87
125
0
02 May 2024
MileBench: Benchmarking MLLMs in Long Context
Dingjie Song
Shunian Chen
Guiming Hardy Chen
Fei Yu
Xiang Wan
Benyou Wang
VLM
125
44
0
29 Apr 2024
BLINK: Multimodal Large Language Models Can See but Not Perceive
Xingyu Fu
Yushi Hu
Bangzheng Li
Yu Feng
Haoyu Wang
Xudong Lin
Dan Roth
Noah A. Smith
Wei-Chiu Ma
Ranjay Krishna
VLM
LRM
MLLM
94
149
0
18 Apr 2024
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark
Kunchang Li
Yali Wang
Yinan He
Yizhuo Li
Yi Wang
...
Jilan Xu
Guo Chen
Ping Luo
Limin Wang
Yu Qiao
VLM
MLLM
141
503
0
28 Nov 2023
Aligning Large Multimodal Models with Factually Augmented RLHF
Zhiqing Sun
Sheng Shen
Shengcao Cao
Haotian Liu
Chunyuan Li
...
Liangyan Gui
Yu-Xiong Wang
Yiming Yang
Kurt Keutzer
Trevor Darrell
VLM
119
394
0
25 Sep 2023
Q-Bench: A Benchmark for General-Purpose Foundation Models on Low-level Vision
Haoning Wu
Zicheng Zhang
Erli Zhang
Chaofeng Chen
Liang Liao
...
Chunyi Li
Wenxiu Sun
Qiong Yan
Guangtao Zhai
Weisi Lin
VLM
91
155
0
25 Sep 2023
MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
Henghui Ding
Chang Liu
Shuting He
Xudong Jiang
Chen Change Loy
VOS
110
116
0
16 Aug 2023
Fine-tuning Multimodal LLMs to Follow Zero-shot Demonstrative Instructions
Juncheng Li
Kaihang Pan
Zhiqi Ge
Minghe Gao
Wei Ji
Wenqiao Zhang
Tat-Seng Chua
Siliang Tang
Hanwang Zhang
Yueting Zhuang
MLLM
73
73
0
08 Aug 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
387
4,125
0
29 May 2023
Multimodal Chain-of-Thought Reasoning in Language Models
Zhuosheng Zhang
Aston Zhang
Mu Li
Hai Zhao
George Karypis
Alexander J. Smola
LRM
109
464
0
02 Feb 2023
NLVR2 Visual Bias Analysis
Alane Suhr
Yoav Artzi
39
13
0
23 Sep 2019
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
529
19,237
0
20 Jul 2017