Fixed Point Explainability

18 May 2025
Emanuele La Malfa
Jon Vadillo
Marco Molinari
Michael Wooldridge
Abstract

This paper introduces a formal notion of fixed point explanations, inspired by the "why regress" principle, to assess, through recursive applications, the stability of the interplay between a model and its explainer. Fixed point explanations satisfy properties like minimality, stability, and faithfulness, revealing hidden model behaviours and explanatory weaknesses. We define convergence conditions for several classes of explainers, from feature-based to mechanistic tools like Sparse AutoEncoders, and we report quantitative and qualitative results.
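
The recursive idea in the abstract can be illustrated with a toy sketch. It is not the authors' algorithm or their formal definition: the linear model, the mean-threshold explainer, and the names `make_toy_explainer` and `fixed_point_explanation` are all assumptions made for illustration. The explainer is reapplied to its own output until the selected feature set stops changing.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

Explainer = Callable[[FrozenSet[int]], FrozenSet[int]]


def make_toy_explainer(weights: Sequence[float], x: Sequence[float]) -> Explainer:
    """Hypothetical feature-based explainer for a linear model w . x.

    Given the currently active features, it keeps those whose absolute
    contribution |w_i * x_i| is at least the mean over the active set.
    The top contributor is always kept so the result is never empty.
    """
    contrib = [abs(w * v) for w, v in zip(weights, x)]

    def explain(active: FrozenSet[int]) -> FrozenSet[int]:
        if not active:
            return active
        mean = sum(contrib[i] for i in active) / len(active)
        top = max(contrib[i] for i in active)
        return frozenset(
            i for i in active if contrib[i] >= mean or contrib[i] == top
        )

    return explain


def fixed_point_explanation(
    explain: Explainer, start: FrozenSet[int], max_iters: int = 100
) -> Tuple[FrozenSet[int], List[FrozenSet[int]]]:
    """Reapply `explain` until explain(E) == E; return E and the trace."""
    trace = [start]
    current = start
    for _ in range(max_iters):
        nxt = explain(current)
        if nxt == current:  # the explanation explains itself: a fixed point
            return current, trace
        trace.append(nxt)
        current = nxt
    raise RuntimeError("no fixed point reached within max_iters")


# Toy usage: four features with contributions 4, 3, 1, and 0.5.
explain = make_toy_explainer(weights=[4.0, 3.0, 1.0, 0.5], x=[1.0, 1.0, 1.0, 1.0])
fixed, trace = fixed_point_explanation(explain, frozenset(range(4)))
print(sorted(fixed), [sorted(step) for step in trace])
```

In this toy setting each application can only shrink the active set, so the iteration always stops on a finite feature set. The convergence conditions the paper establishes for real explainers are more general and are not reproduced here.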

@article{malfa2025_2505.12421,
  title={Fixed Point Explainability},
  author={Emanuele La Malfa and Jon Vadillo and Marco Molinari and Michael Wooldridge},
  journal={arXiv preprint arXiv:2505.12421},
  year={2025}
}