Escaping Plato's Cave: Robust Conceptual Reasoning through Interpretable 3D Neural Object Volumes

17 March 2025

Abstract

With the rise of neural networks, especially in high-stakes applications, these networks need two properties to ensure their safety: (i) robustness and (ii) interpretability. Recent advances in classifiers with 3D volumetric object representations have demonstrated greatly enhanced robustness on out-of-distribution data. However, these 3D-aware classifiers have not been studied from the perspective of interpretability. We introduce CAVE (Concept Aware Volumes for Explanations), a new direction that unifies interpretability and robustness in image classification. We design an inherently interpretable and robust classifier by extending existing 3D-aware classifiers with concepts extracted from their volumetric representations for classification. Across an array of quantitative interpretability metrics, we compare against different concept-based approaches from the explainable AI literature and show that CAVE discovers well-grounded concepts that are used consistently across images, while achieving superior robustness.


@article{pham2025_2503.13429,
  title={Escaping Plato's Cave: Robust Conceptual Reasoning through Interpretable 3D Neural Object Volumes},
  author={Nhi Pham and Bernt Schiele and Adam Kortylewski and Jonas Fischer},
  journal={arXiv preprint arXiv:2503.13429},
  year={2025}
}
