2408.00765
Cited By
MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities
1 August 2024
Weihao Yu
Zhengyuan Yang
Linfeng Ren
Linjie Li
Jianfeng Wang
Kevin Qinghong Lin
Chung-Ching Lin
Zicheng Liu
Lijuan Wang
Xinchao Wang
VLM
MLLM
Papers citing "MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities"
12 / 12 papers shown
GenRecal: Generation after Recalibration from Large to Small Vision-Language Models
Byung-Kwan Lee
Ryo Hachiuma
Yong Man Ro
Yu-Chun Wang
Yueh-Hua Wu
VLM
38
0
0
18 Jun 2025
ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs
Xiyao Wang
Zhengyuan Yang
Chao Feng
Yongyuan Liang
Yuhang Zhou
...
Chung-Ching Lin
Kevin Lin
Linjie Li
Furong Huang
Lijuan Wang
OffRL
LRM
57
0
0
11 Jun 2025
FlagEvalMM: A Flexible Framework for Comprehensive Multimodal Model Evaluation
Zheqi He
Yesheng Liu
Jing-shu Zheng
Xuejing Li
Richeng Xuan
Jin-Ge Yao
Xi Yang
MLLM
VLM
44
0
0
10 Jun 2025
Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning
Tianyi Bai
Yuxuan Fan
Jiantao Qiu
Fupeng Sun
Jiayi Song
Junlin Han
Zichen Liu
Conghui He
Wentao Zhang
Binhang Yuan
MLLM
VLM
23
0
0
08 Jun 2025
OMNIGUARD: An Efficient Approach for AI Safety Moderation Across Modalities
Sahil Verma
Keegan E. Hines
J. Bilmes
Charlotte Siska
Luke Zettlemoyer
Hila Gonen
Chandan Singh
AAML
17
0
0
29 May 2025
Point-RFT: Improving Multimodal Reasoning with Visually Grounded Reinforcement Finetuning
Minheng Ni
Zhengyuan Yang
Linjie Li
Chung-Ching Lin
Kevin Qinghong Lin
W. Zuo
Lijuan Wang
ReLM
LRM
85
1
0
26 May 2025
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities
Wei Wei
Jintao Guo
Shanshan Zhao
Minghao Fu
Lunhao Duan
...
Guo-Hua Wang
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
DiffM
301
1
0
05 May 2025
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Jinguo Zhu
Weiyun Wang
Zhe Chen
Ziwei Liu
Shenglong Ye
...
Dahua Lin
Yu Qiao
Jifeng Dai
Wenhai Wang
Wei Wang
MLLM
VLM
221
132
1
14 Apr 2025
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement
Xiyao Wang
Zhengyuan Yang
Chao Feng
Hongjin Lu
Linjie Li
Chung-Ching Lin
Kevin Qinghong Lin
Furong Huang
Lijuan Wang
OODD
ReLM
LRM
VLM
214
19
0
10 Apr 2025
V-MAGE: A Game Evaluation Framework for Assessing Vision-Centric Capabilities in Multimodal Large Language Models
Xiangxi Zheng
Linjie Li
Zhengyuan Yang
Ping Yu
Alex Jinpeng Wang
Rui Yan
Yuan Yao
Lijuan Wang
LRM
67
1
0
08 Apr 2025
RGB-Th-Bench: A Dense benchmark for Visual-Thermal Understanding of Vision Language Models
Mehdi Moshtaghi
Siavash H. Khajavi
Joni Pajarinen
VLM
148
0
0
25 Mar 2025
AutoBench-V: Can Large Vision-Language Models Benchmark Themselves?
Han Bao
Yue Huang
Yanbo Wang
Jiayi Ye
Xiangqi Wang
Preslav Nakov
Mohamed Elhoseiny
Wei Wei
Xiangliang Zhang
109
11
0
28 Oct 2024