A Comparative Study of Faithfulness Metrics for Model Interpretability Methods

12 April 2022

Papers citing "A Comparative Study of Faithfulness Metrics for Model Interpretability Methods"

Title | Authors | Tags | Counts | Date
Beyond Patches: Mining Interpretable Part-Prototypes for Explainable AI | Mahdi Alehdaghi, Rajarshi Bhattacharya, Pourya Shamsolmoali, Rafael M. O. Cruz, Maguelonne Heritier, Eric Granger | - | 41 0 0 | 16 Apr 2025
Beyond Label Attention: Transparency in Language Models for Automated Medical Coding via Dictionary Learning | John Wu, David Wu, Jimeng Sun | - | 52 1 0 | 31 Oct 2024
DILA: Dictionary Label Attention for Mechanistic Interpretability in High-dimensional Multi-label Medical Coding Prediction | John Wu, David Wu, Jimeng Sun | - | 165 0 0 | 16 Sep 2024
Counterfactuals As a Means for Evaluating Faithfulness of Attribution Methods in Autoregressive Language Models | Sepehr Kamahi, Yadollah Yaghoobzadeh | - | 53 0 0 | 21 Aug 2024
ALMANACS: A Simulatability Benchmark for Language Model Explainability | Edmund Mills, Shiye Su, Stuart J. Russell, Scott Emmons | - | 56 7 0 | 20 Dec 2023
Truthful Meta-Explanations for Local Interpretability of Machine Learning Models | Ioannis Mollas, Nick Bassiliades, Grigorios Tsoumakas | - | 18 3 0 | 07 Dec 2022
The Solvability of Interpretability Evaluation Metrics | Yilun Zhou, J. Shah | - | 76 8 0 | 18 May 2022
Measuring the Mixing of Contextual Information in the Transformer | Javier Ferrando, Gerard I. Gállego, Marta R. Costa-jussá | - | 31 50 0 | 08 Mar 2022
Local Interpretations for Explainable Natural Language Processing: A Survey | Siwen Luo, Hamish Ivison, S. Han, Josiah Poon | MILM | 40 48 0 | 20 Mar 2021
Towards A Rigorous Science of Interpretable Machine Learning | Finale Doshi-Velez, Been Kim | XAI, FaML | 257 3,690 0 | 28 Feb 2017
Convolutional Neural Networks for Sentence Classification | Yoon Kim | AILaw, VLM | 273 13,368 0 | 25 Aug 2014