To what extent do human explanations of model behavior align with actual model behavior?

24 December 2020

Joey Tianyi Zhou

Robin Jia

Douwe Kiela

Papers citing "To what extent do human explanations of model behavior align with actual model behavior?"

Shortcut Learning of Large Language Models in Natural Language Understanding. Mengnan Du, Fengxiang He, Na Zou, Dacheng Tao, Xia Hu. KELM, OffRL. 42 84 0. 25 Aug 2022.
Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Tokens and Retraining. Andreas Madsen, Nicholas Meade, Vaibhav Adlakha, Siva Reddy. 111 35 0. 15 Oct 2021.
How Well do Feature Visualizations Support Causal Understanding of CNN Activations? Roland S. Zimmermann, Judy Borowski, Robert Geirhos, Matthias Bethge, Thomas S. A. Wallis, Wieland Brendel. FAtt. 47 31 0. 23 Jun 2021.
Local Interpretations for Explainable Natural Language Processing: A Survey. Siwen Luo, Hamish Ivison, S. Han, Josiah Poon. MILM. 43 48 0. 20 Mar 2021.
ANLIzing the Adversarial Natural Language Inference Dataset. Adina Williams, Tristan Thrush, Douwe Kiela. AAML. 183 46 0. 24 Oct 2020.
Explainable Deep Learning: A Field Guide for the Uninitiated. Gabrielle Ras, Ning Xie, Marcel van Gerven, Derek Doran. AAML, XAI. 41 371 0. 30 Apr 2020.
e-SNLI: Natural Language Inference with Natural Language Explanations. Oana-Maria Camburu, Tim Rocktäschel, Thomas Lukasiewicz, Phil Blunsom. LRM. 287 623 0. 04 Dec 2018.
Hypothesis Only Baselines in Natural Language Inference. Adam Poliak, Jason Naradowsky, Aparajita Haldar, Rachel Rudinger, Benjamin Van Durme. 190 576 0. 02 May 2018.