Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2410.04509
Cited By
ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection
6 October 2024
Yibo Yan
Shen Wang
Jiahao Huo
Hang Li
B. Li
Jiamin Su
Xiong Gao
Yi-Fan Zhang
Tianlong Xu
Zhendong Chu
Aoxiao Zhong
Kun Wang
Hui Xiong
Philip S. Yu
Xuming Hu
Qingsong Wen
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection"
10 / 10 papers shown
Title
Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency
Zhikai Wang
Jiashuo Sun
W. Zhang
Zhiqiang Hu
Xin Li
F. Wang
Deli Zhao
VLM
LRM
75
0
0
24 Apr 2025
MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models
Wulin Xie
Y. Zhang
Chaoyou Fu
Yang Shi
Bingyan Nie
Hongkai Chen
Z. Zhang
Liang Wang
T. Tan
33
1
0
04 Apr 2025
MathAgent: Leveraging a Mixture-of-Math-Agent Framework for Real-World Multimodal Mathematical Error Detection
Yibo Yan
Shen Wang
Jiahao Huo
Philip S. Yu
Xuming Hu
Qingsong Wen
126
1
0
23 Mar 2025
LLM Agents for Education: Advances and Applications
Zhendong Chu
Shen Wang
Jian Xie
Tinghui Zhu
Yibo Yan
...
Aoxiao Zhong
Xuming Hu
Jing Liang
Philip S. Yu
Qingsong Wen
LLMAG
ELM
116
1
0
14 Mar 2025
From Correctness to Comprehension: AI Agents for Personalized Error Diagnosis in Education
Yi-Fan Zhang
Hang Li
D. Song
Lichao Sun
Tianlong Xu
Qingsong Wen
LLMAG
LRM
85
2
0
20 Feb 2025
SafeEraser: Enhancing Safety in Multimodal Large Language Models through Multimodal Machine Unlearning
Junkai Chen
Zhijie Deng
Kening Zheng
Yibo Yan
Shuliang Liu
PeiJun Wu
Peijie Jiang
J. Liu
Xuming Hu
MU
61
3
0
18 Feb 2025
EssayJudge: A Multi-Granular Benchmark for Assessing Automated Essay Scoring Capabilities of Multimodal Large Language Models
Jiamin Su
Yibo Yan
Fangteng Fu
H. Zhang
Jingheng Ye
Xiang Liu
Jiahao Huo
Huiyu Zhou
Xuming Hu
ELM
52
0
0
17 Feb 2025
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
Ruilin Luo
Zhuofan Zheng
Yifan Wang
Yiyao Yu
Xinzhe Ni
Zicheng Lin
Jin Zeng
Yujiu Yang
LRM
70
13
0
08 Jan 2025
PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models
Mingyang Song
Zhaochen Su
Xiaoye Qu
Jiawei Zhou
Yu-Xi Cheng
LRM
53
29
0
06 Jan 2025
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Yunkai Dang
Kaichen Huang
Jiahao Huo
Yibo Yan
S. Huang
...
Kun Wang
Yong Liu
Jing Shao
Hui Xiong
Xuming Hu
LRM
101
14
0
03 Dec 2024
1