CAUSAL3D: A Comprehensive Benchmark for Causal Learning from Visual Data

6 March 2025

Abstract

True intelligence hinges on the ability to uncover and leverage hidden causal relations. Despite significant progress in AI and computer vision (CV), there remains a lack of benchmarks for assessing models' abilities to infer latent causality from complex visual data. In this paper, we introduce Causal3D, a novel and comprehensive benchmark that integrates structured data (tables) with corresponding visual representations (images) to evaluate causal reasoning. Designed within a systematic framework, Causal3D comprises 19 3D-scene datasets capturing diverse causal relations, views, and backgrounds, enabling evaluations across scenes of varying complexity. We assess multiple state-of-the-art methods, including classical causal discovery, causal representation learning, and large/vision-language models (LLMs/VLMs). Our experiments show that as causal structures grow more complex without prior knowledge, performance declines significantly, highlighting the challenges even advanced methods face in complex causal scenarios. Causal3D serves as a vital resource for advancing causal reasoning in CV and fostering trustworthy AI in critical domains.


@article{liu2025_2503.04852,
  title={CAUSAL3D: A Comprehensive Benchmark for Causal Learning from Visual Data},
  author={Disheng Liu and Yiran Qiao and Wuche Liu and Yiren Lu and Yunlai Zhou and Tuo Liang and Yu Yin and Jing Ma},
  journal={arXiv preprint arXiv:2503.04852},
  year={2025}
}
