Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?

7 April 2020
Alon Jacovi, Yoav Goldberg
XAI

Papers citing "Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?"

Showing 50 of 381 citing papers:

Does Faithfulness Conflict with Plausibility? An Empirical Study in Explainable AI across NLP Tasks
Xiaolei Lu, Jianghong Ma
29 Mar 2024

Towards a Framework for Evaluating Explanations in Automated Fact Verification
Neema Kotonya, Francesca Toni
29 Mar 2024

The Role of Syntactic Span Preferences in Post-Hoc Explanation Disagreement
Jonathan Kamp, Lisa Beinborn, Antske Fokkens
28 Mar 2024

RankingSHAP -- Listwise Feature Attribution Explanations for Ranking Models
Maria Heuss, Maarten de Rijke, Avishek Anand
24 Mar 2024

Visual Analytics for Fine-grained Text Classification Models and Datasets
Munkhtulga Battogtokh, Y. Xing, Cosmin Davidescu, Alfie Abdul-Rahman, Michael Luck, Rita Borgo
21 Mar 2024

Clinical information extraction for Low-resource languages with Few-shot learning using Pre-trained language models and Prompting
Phillip Richter-Pechanski, Philipp Wiesenbach, Dominic M. Schwab, Christina Kiriakou, Nicolas Geis, Christoph Dieterich, Anette Frank
20 Mar 2024

Comparing Explanation Faithfulness between Multilingual and Monolingual Fine-tuned Language Models
Zhixue Zhao, Nikolaos Aletras
19 Mar 2024

Demystifying the Physics of Deep Reinforcement Learning-Based Autonomous Vehicle Decision-Making
Hanxi Wan, Pei Li, Arpan Kusari
AI4CE
18 Mar 2024

A Question on the Explainability of Large Language Models and the Word-Level Univariate First-Order Plausibility Assumption
Jérémie Bogaert, François-Xavier Standaert
AAML, LRM
15 Mar 2024

Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought
James Chua, Edward Rees, Hunar Batra, Samuel R. Bowman, Julian Michael, Ethan Perez, Miles Turpin
LRM
08 Mar 2024

Best of Both Worlds: A Pliable and Generalizable Neuro-Symbolic Approach for Relation Classification
Robert Vacareanu, F. Alam, M. Islam, Haris Riaz, Mihai Surdeanu
NAI
05 Mar 2024

Chain-of-Thought Unfaithfulness as Disguised Accuracy
Oliver Bentham, Nathan Stringham, Ana Marasović
LRM, HILM
22 Feb 2024

Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning
Debjit Paul, Robert West, Antoine Bosselut, Boi Faltings
ReLM, LRM
21 Feb 2024

Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning
Philipp Mondorf, Barbara Plank
LRM
20 Feb 2024

How Interpretable are Reasoning Explanations from Prompting Large Language Models?
Yeo Wei Jie, Ranjan Satapathy, Rick Mong, Min Zhang
ReLM, LRM
19 Feb 2024

CliqueParcel: An Approach For Batching LLM Prompts That Jointly Optimizes Efficiency And Faithfulness
Jiayi Liu, Tinghan Yang, Jennifer Neville
17 Feb 2024

Properties and Challenges of LLM-Generated Explanations
Jenny Kunz, Marco Kuhlmann
16 Feb 2024

Plausible Extractive Rationalization through Semi-Supervised Entailment Signal
Yeo Wei Jie, Ranjan Satapathy, Min Zhang
13 Feb 2024

OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models
Hainiu Xu, Runcong Zhao, Lixing Zhu, Bin Liang, Yulan He
08 Feb 2024

Advancing Explainable AI Toward Human-Like Intelligence: Forging the Path to Artificial Brain
Yongchen Zhou, Richard Jiang
07 Feb 2024

A Hypothesis-Driven Framework for the Analysis of Self-Rationalising Models
Marc Braun, Jenny Kunz
07 Feb 2024

The Future of Cognitive Strategy-enhanced Persuasive Dialogue Agents: New Perspectives and Trends
Mengqi Chen, Bin Guo, Hao Wang, Haoyu Li, Qian Zhao, Jingqi Liu, Yasan Ding, Yan Pan, Zhiwen Yu
LLMAG
07 Feb 2024

Faithfulness vs. Plausibility: On the (Un)Reliability of Explanations from Large Language Models
Chirag Agarwal, Sree Harsha Tanneru, Himabindu Lakkaraju
LRM
07 Feb 2024

ReAGent: A Model-agnostic Feature Attribution Method for Generative Language Models
Zhixue Zhao, Boxuan Shan
01 Feb 2024

Towards Consistent Natural-Language Explanations via Explanation-Consistency Finetuning
Yanda Chen, Chandan Singh, Xiaodong Liu, Simiao Zuo, Bin-Xia Yu, He He, Jianfeng Gao
LRM
25 Jan 2024

Generating Zero-shot Abstractive Explanations for Rumour Verification
I. Bilal, Preslav Nakov, Rob Procter, M. Liakata
23 Jan 2024

B-Cos Aligned Transformers Learn Human-Interpretable Features
Manuel Tran, Amal Lahiani, Yashin Dicente Cid, Melanie Boxberg, Peter Lienemann, C. Matek, S. J. Wagner, Fabian J. Theis, Eldad Klaiman, Tingying Peng
MedIm, ViT
16 Jan 2024

Are self-explanations from Large Language Models faithful?
Andreas Madsen, Sarath Chandar, Siva Reddy
LRM
15 Jan 2024

Evaluating Language Model Agency through Negotiations
Tim R. Davidson, V. Veselovsky, Martin Josifoski, Maxime Peyrard, Antoine Bosselut, Michal Kosinski, Robert West
LLMAG
09 Jan 2024

Towards Faithful Explanations for Text Classification with Robustness Improvement and Explanation Guided Training
Dongfang Li, Baotian Hu, Qingcai Chen, Shan He
29 Dec 2023

Don't Believe Everything You Read: Enhancing Summarization Interpretability through Automatic Identification of Hallucinations in Large Language Models
Priyesh Vakharia, Devavrat Joshi, Meenal Chavan, Dhananjay Sonawane, Bhrigu Garg, Parsa Mazaheri
HILM
22 Dec 2023

ALMANACS: A Simulatability Benchmark for Language Model Explainability
Edmund Mills, Shiye Su, Stuart J. Russell, Scott Emmons
20 Dec 2023

The Problem of Coherence in Natural Language Explanations of Recommendations
Jakub Raczynski, Mateusz Lango, Jerzy Stefanowski
18 Dec 2023

Explain To Decide: A Human-Centric Review on the Role of Explainable Artificial Intelligence in AI-assisted Decision Making
Milad Rogha
11 Dec 2023

A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia
Giovanni Monea, Maxime Peyrard, Martin Josifoski, Vishrav Chaudhary, Jason Eisner, Emre Kiciman, Hamid Palangi, Barun Patra, Robert West
KELM
04 Dec 2023

Japanese Tort-case Dataset for Rationale-supported Legal Judgment Prediction
Hiroaki Yamada, Takenobu Tokunaga, Ryutaro Ohara, Akira Tokutsu, Keisuke Takeshita, Mihoko Sumida
ELM, AILaw
01 Dec 2023

Conceptual Engineering Using Large Language Models
Bradley Paul Allen
01 Dec 2023

Improving Interpretation Faithfulness for Vision Transformers
Lijie Hu, Yixin Liu, Ninghao Liu, Mengdi Huai, Lichao Sun, Di Wang
29 Nov 2023

What if you said that differently?: How Explanation Formats Affect Human Feedback Efficacy and User Perception
Chaitanya Malaviya, Subin Lee, Dan Roth, Mark Yatskar
16 Nov 2023

On Measuring Faithfulness or Self-consistency of Natural Language Explanations
Letitia Parcalabescu, Anette Frank
LRM
13 Nov 2023

Large Language Models are In-context Teachers for Knowledge Reasoning
Jiachen Zhao, Zonghai Yao, Zhichao Yang, Hong-ye Yu
ReLM, LRM
12 Nov 2023

A Survey of Large Language Models Attribution
Dongfang Li, Zetian Sun, Xinshuo Hu, Zhenyu Liu, Ziyang Chen, Baotian Hu, Aiguo Wu, Min Zhang
HILM
07 Nov 2023

Quantifying Uncertainty in Natural Language Explanations of Large Language Models
Sree Harsha Tanneru, Chirag Agarwal, Himabindu Lakkaraju
LRM
06 Nov 2023

Proto-lm: A Prototypical Network-Based Framework for Built-in Interpretability in Large Language Models
Sean Xie, Soroush Vosoughi, Saeed Hassanpour
03 Nov 2023

Interpretable-by-Design Text Understanding with Iteratively Generated Concept Bottleneck
Josh Magnus Ludan, Qing Lyu, Yue Yang, Liam Dugan, Mark Yatskar, Chris Callison-Burch
30 Oct 2023

On the Interplay between Fairness and Explainability
Stephanie Brandl, Emanuele Bugliarello, Ilias Chalkidis
FaML
25 Oct 2023

K-HATERS: A Hate Speech Detection Corpus in Korean with Target-Specific Ratings
Chaewon Park, Soohwan Kim, Kyubyong Park, Kunwoo Park
24 Oct 2023

Cross-Modal Conceptualization in Bottleneck Models
Danis Alukaev, S. Kiselev, Ilya S. Pershin, Bulat Ibragimov, Vladimir Ivanov, Alexey Kornaev, Ivan Titov
23 Oct 2023

REFER: An End-to-end Rationale Extraction Framework for Explanation Regularization
Mohammad Reza Ghasemi Madani, Pasquale Minervini
22 Oct 2023

QA-NatVer: Question Answering for Natural Logic-based Fact Verification
Rami Aly, Marek Strong, Andreas Vlachos
22 Oct 2023