Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?

7 April 2020
Alon Jacovi, Yoav Goldberg
XAI
arXiv:2004.03685

Papers citing "Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?"

50 / 381 papers shown
Fixed Point Explainability
Emanuele La Malfa, Jon Vadillo, Marco Molinari, Michael Wooldridge
18 May 2025

LAMP: Extracting Locally Linear Decision Surfaces from LLM World Models
Ryan Chen, Youngmin Ko, Zeyu Zhang, Catherine Cho, Sunny Chung, Mauro Giuffré, Dennis L. Shung, Bradly C. Stadie
17 May 2025

Unveiling Knowledge Utilization Mechanisms in LLM-based Retrieval-Augmented Generation
Yuhao Wang, Ruiyang Ren, Yucheng Wang, Wayne Xin Zhao, Jing Liu, Hua Wu, Haifeng Wang
17 May 2025

DSADF: Thinking Fast and Slow for Decision Making
Alex Zhihao Dou, Dongfei Cui, Jun Yan, Wei Wang, Benteng Chen, Haoming Wang, Zeke Xie, Shufei Zhang
OffRL
13 May 2025

From Pixels to Perception: Interpretable Predictions via Instance-wise Grouped Feature Selection
Moritz Vandenhirtz, Julia E. Vogt
09 May 2025

Reasoning Models Don't Always Say What They Think
Yanda Chen, Joe Benton, Ansh Radhakrishnan, Jonathan Uesato, Carson E. Denison, ..., Vlad Mikulik, Samuel R. Bowman, Jan Leike, Jared Kaplan, E. Perez
ReLM, LRM
08 May 2025

Adversarial Cooperative Rationalization: The Risk of Spurious Correlations in Even Clean Datasets
Wei Liu, Zhongyu Niu, Lang Gao, Zhiying Deng, Jun Wang, Haozhao Wang, Ruixuan Li
04 May 2025

PhysNav-DG: A Novel Adaptive Framework for Robust VLM-Sensor Fusion in Navigation Applications
Trisanth Srinivasan, Santosh Patapati
03 May 2025

Gender Bias in Explainability: Investigating Performance Disparity in Post-hoc Methods
Mahdi Dhaini, Ege Erdogan, Nils Feldhus, Gjergji Kasneci
02 May 2025

Walk the Talk? Measuring the Faithfulness of Large Language Model Explanations
Katie Matton, Robert Osazuwa Ness, John Guttag, Emre Kıcıman
19 Apr 2025

A constraints-based approach to fully interpretable neural networks for detecting learner behaviors
Juan D. Pinto, Luc Paquette
10 Apr 2025

A Meaningful Perturbation Metric for Evaluating Explainability Methods
Danielle Cohen, Hila Chefer, Lior Wolf
AAML
09 Apr 2025

LExT: Towards Evaluating Trustworthiness of Natural Language Explanations
Krithi Shailya, Shreya Rajpal, Gokul S Krishnan, Balaraman Ravindran
ELM
08 Apr 2025

Truthful or Fabricated? Using Causal Attribution to Mitigate Reward Hacking in Explanations
Pedro Ferreira, Wilker Aziz, Ivan Titov
LRM
07 Apr 2025

CFIRE: A General Method for Combining Local Explanations
Sebastian Müller, Vanessa Toborek, Tamás Horváth, Christian Bauckhage
FAtt
01 Apr 2025

LLMs for Explainable AI: A Comprehensive Survey
Ahsan Bilal, David Ebert, Beiyu Lin
31 Mar 2025

On Explaining (Large) Language Models For Code Using Global Code-Based Explanations
David Nader-Palacio, Dipin Khati, Daniel Rodríguez-Cárdenas, Alejandro Velasco, Denys Poshyvanyk
LRM
21 Mar 2025

Faithfulness of LLM Self-Explanations for Commonsense Tasks: Larger Is Better, and Instruction-Tuning Allows Trade-Offs but Not Pareto Dominance
Noah Y. Siegel, N. Heess, Maria Perez-Ortiz, Oana-Maria Camburu
LRM
17 Mar 2025

Reasoning-Grounded Natural Language Explanations for Language Models
Vojtech Cahlik, Rodrigo Alves, Pavel Kordík
LRM
14 Mar 2025

Combining Causal Models for More Accurate Abstractions of Neural Networks
Theodora-Mara Pîslar, Sara Magliacane, Atticus Geiger
AI4CE
14 Mar 2025

Monitoring Reasoning Models for Misbehavior and the Risks of Promoting Obfuscation
Bowen Baker, Joost Huizinga, Leo Gao, Zehao Dou, M. Guan, Aleksander Mądry, Wojciech Zaremba, J. Pachocki, David Farhi
LRM
14 Mar 2025

Cross-Examiner: Evaluating Consistency of Large Language Model-Generated Explanations
Danielle Villa, Maria Chang, K. Murugesan, Rosario A. Uceda-Sosa, Karthikeyan N. Ramamurthy
LRM
11 Mar 2025

A Unified Framework with Novel Metrics for Evaluating the Effectiveness of XAI Techniques in LLMs
Melkamu Mersha, Mesay Gemeda Yigezu, Hassan Shakil, Ali Al shami, SangHyun Byun, Jugal Kalita
06 Mar 2025

A Causal Lens for Evaluating Faithfulness Metrics
Kerem Zaman, Shashank Srivastava
26 Feb 2025

Can LLMs Explain Themselves Counterfactually?
Zahra Dehghanighobadi, Asja Fischer, Muhammad Bilal Zafar
LRM
25 Feb 2025

Comparing zero-shot self-explanations with human rationales in text classification
Stephanie Brandl, Oliver Eberle
24 Feb 2025

Beyond Translation: LLM-Based Data Generation for Multilingual Fact-Checking
Yi-Ling Chung, Aurora Cobo, Pablo Serna
SyDa, HILM
24 Feb 2025

A Survey of Model Architectures in Information Retrieval
Zhichao Xu, Fengran Mo, Zhiqi Huang, Crystina Zhang, Puxuan Yu, Bei Wang, Jimmy J. Lin, Vivek Srikumar
KELM, 3DV
21 Feb 2025

A Close Look at Decomposition-based XAI-Methods for Transformer Language Models
L. Arras, Bruno Puri, Patrick Kahardipraja, Sebastian Lapuschkin, Wojciech Samek
21 Feb 2025

Show Me the Work: Fact-Checkers' Requirements for Explainable Automated Fact-Checking
Greta Warren, Irina Shklovski, Isabelle Augenstein
OffRL
13 Feb 2025

Fostering Appropriate Reliance on Large Language Models: The Role of Explanations, Sources, and Inconsistencies
Sunnie S. Y. Kim, J. Vaughan, Q. V. Liao, Tania Lombrozo, Olga Russakovsky
12 Feb 2025

Is Conversational XAI All You Need? Human-AI Decision Making With a Conversational XAI Assistant
Gaole He, Nilay Aishwarya, U. Gadiraju
29 Jan 2025

A Study of the Plausibility of Attention between RNN Encoders in Natural Language Inference
Duc Hau Nguyen, Pascale Sébillot
23 Jan 2025

Regularization, Semi-supervision, and Supervision for a Plausible Attention-Based Explanation
Duc Hau Nguyen, Cyrielle Mallart, Guillaume Gravier, Pascale Sébillot
22 Jan 2025

ConSim: Measuring Concept-Based Explanations' Effectiveness with Automated Simulatability
Antonin Poché, Alon Jacovi, Agustin Picard, Victor Boutin, Fanny Jourdan
10 Jan 2025

Codebook LLMs: Evaluating LLMs as Measurement Tools for Political Science Concepts
Andrew Halterman, Katherine A. Keith
10 Jan 2025

Explainable Time Series Prediction of Tyre Energy in Formula One Race Strategy
Jamie Todd, Junqi Jiang, Aaron Russo, Steffen Winkler, Stuart Sale, Joseph McMillan, Antonio Rago
AI4TS
07 Jan 2025

Boosting Explainability through Selective Rationalization in Pre-trained Language Models
Libing Yuan, Shuaibo Hu, Kui Yu, Le Wu
LRM
03 Jan 2025

Reconciling Privacy and Explainability in High-Stakes: A Systematic Inquiry
Supriya Manna, Niladri Sett
30 Dec 2024

Can Highlighting Help GitHub Maintainers Track Security Fixes?
Xueqing Liu, Yuchen Xiong, Qiushi Liu, Jiangrui Zheng
18 Nov 2024

Explanations that reveal all through the definition of encoding
A. Puli, Nhi Nguyen, Rajesh Ranganath
FAtt, XAI
04 Nov 2024

SPES: Spectrogram Perturbation for Explainable Speech-to-Text Generation
Dennis Fucci, Marco Gaido, Beatrice Savoldi, Matteo Negri, Mauro Cettolo, L. Bentivogli
03 Nov 2024

Causal Abstraction in Model Interpretability: A Compact Survey
Yihao Zhang
26 Oct 2024

Evaluating the Influences of Explanation Style on Human-AI Reliance
Emma Casolin, Flora D. Salim, Ben Newell
26 Oct 2024

On Explaining with Attention Matrices
Omar Naim, Nicholas Asher
24 Oct 2024

Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination
Jerry Huang, Prasanna Parthasarathi, Mehdi Rezagholizadeh, Boxing Chen, Sarath Chandar
22 Oct 2024

XForecast: Evaluating Natural Language Explanations for Time Series Forecasting
Taha Aksu, Chenghao Liu, Amrita Saha, Sarah Tan, Caiming Xiong, Doyen Sahoo
AI4TS
18 Oct 2024

Towards Faithful Natural Language Explanations: A Study Using Activation Patching in Large Language Models
Wei Jie Yeo, Ranjan Satapathy, Erik Cambria
18 Oct 2024

Hypothesis Testing the Circuit Hypothesis in LLMs
Claudia Shi, Nicolas Beltran-Velez, Achille Nazaret, Carolina Zheng, Adrià Garriga-Alonso, Andrew Jesson, Maggie Makar, David M. Blei
16 Oct 2024

Fool Me Once? Contrasting Textual and Visual Explanations in a Clinical Decision-Support Setting
Maxime Kayser, Bayar I. Menzat, Cornelius Emde, Bogdan Bercean, Alex Novak, Abdala Espinosa, B. Papież, Susanne Gaube, Thomas Lukasiewicz, Oana-Maria Camburu
16 Oct 2024