MIRAGE: A Multi-modal Benchmark for Spatial Perception, Reasoning, and Intelligence

15 May 2025
Chonghan Liu
Haoran Wang
Felix Henry
Pu Miao
Yajie Zhang
Yu Zhao
Peiran Wu
    VLM
Abstract

Spatial perception and reasoning are core components of human cognition, encompassing object recognition, spatial relational understanding, and dynamic reasoning. Despite progress in computer vision, existing benchmarks reveal significant gaps in models' abilities to accurately recognize object attributes and reason about spatial relationships, both essential for dynamic reasoning. To address these limitations, we propose MIRAGE, a multi-modal benchmark designed to evaluate models' capabilities in Counting (object attribute recognition), Relation (spatial relational reasoning), and Counting with Relation. Through diverse and complex scenarios requiring fine-grained recognition and reasoning, MIRAGE highlights critical limitations in state-of-the-art models, underscoring the need for improved representations and reasoning frameworks. By targeting these foundational abilities, MIRAGE provides a pathway toward spatiotemporal reasoning in future research.

@article{liu2025_2505.10604,
  title={MIRAGE: A Multi-modal Benchmark for Spatial Perception, Reasoning, and Intelligence},
  author={Chonghan Liu and Haoran Wang and Felix Henry and Pu Miao and Yajie Zhang and Yu Zhao and Peiran Wu},
  journal={arXiv preprint arXiv:2505.10604},
  year={2025}
}