Pixel-level Certified Explanations via Randomized Smoothing

Post-hoc attribution methods aim to explain deep learning predictions by highlighting influential input pixels. However, these explanations are highly non-robust: small, imperceptible input perturbations can drastically alter the attribution map while leaving the prediction unchanged. This vulnerability undermines their trustworthiness and calls for rigorous robustness guarantees on pixel-level attribution scores. We introduce the first certification framework that guarantees pixel-level robustness for any black-box attribution method using randomized smoothing. By sparsifying and smoothing attribution maps, we reformulate the task as a segmentation problem and certify each pixel's importance against ℓ2-bounded perturbations. We further propose three evaluation metrics to assess certified robustness, localization, and faithfulness. An extensive evaluation of 12 attribution methods across 5 ImageNet models shows that our certified attributions are robust, interpretable, and faithful, enabling reliable use in downstream tasks. Our code is at this https URL.
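The pipeline the abstract describes (sparsify each noisy attribution map into a binary "important / not important" segmentation, majority-vote across Gaussian noise samples, and certify each pixel with the standard randomized-smoothing bound) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function name `smooth_and_certify`, the top-k sparsification rule, and the plug-in estimate of the per-pixel majority probability (in place of a proper binomial confidence bound) are all assumptions made for clarity.

```python
import numpy as np
from scipy.stats import norm


def smooth_and_certify(x, attribution_fn, k, sigma=0.25, n_samples=200):
    """Illustrative per-pixel smoothing of a black-box attribution method.

    For each Gaussian noise sample, the attribution map is sparsified to its
    top-k pixels (a binary segmentation). Pixel-wise vote frequencies across
    samples give a smoothed map, and each pixel receives a certified l2 radius
    via the randomized-smoothing relation R = sigma * Phi^{-1}(p). A rigorous
    certificate would replace the plug-in estimate p_hat with a binomial
    lower confidence bound (e.g. Clopper-Pearson); that step is omitted here.
    """
    votes = np.zeros(x.shape, dtype=int)
    for _ in range(n_samples):
        noisy = x + np.random.randn(*x.shape) * sigma
        attr = attribution_fn(noisy)
        # Sparsify: mark the k highest-scoring pixels as "important" (1).
        thresh = np.partition(attr.ravel(), -k)[-k]
        votes += (attr >= thresh).astype(int)
    p_hat = votes / n_samples                 # per-pixel majority frequency
    p = np.clip(np.maximum(p_hat, 1.0 - p_hat), 0.5, 1.0 - 1e-6)
    radius = sigma * norm.ppf(p)              # certified l2 radius per pixel
    smoothed_mask = (p_hat > 0.5).astype(int)
    return smoothed_mask, radius
```

With the identity map as a stand-in attribution function, a single dominant pixel stays certified as important under noise, since its top-1 membership is stable across all samples.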
@article{anani2025_2506.15499,
  title   = {Pixel-level Certified Explanations via Randomized Smoothing},
  author  = {Alaa Anani and Tobias Lorenz and Mario Fritz and Bernt Schiele},
  journal = {arXiv preprint arXiv:2506.15499},
  year    = {2025}
}