Explanation-Based Human Debugging of NLP Models: A Survey

30 April 2021

Piyawat Lertvittayakumjorn

Francesca Toni

LRM

Papers citing "Explanation-Based Human Debugging of NLP Models: A Survey"

50 / 54 papers shown

Automated Trustworthiness Oracle Generation for Machine Learning Text Classifiers Lam Nguyen Tung Steven Cho Xiaoning Du Neelofar Neelofar Valerio Terragni Stefano Ruberto Aldeida Aleti 507 2 0 30 Oct 2024
ALMANACS: A Simulatability Benchmark for Language Model Explainability Edmund Mills Shiye Su Stuart J. Russell Scott Emmons 145 9 0 20 Dec 2023
Towards Benchmarking the Utility of Explanations for Model Debugging Maximilian Idahl Lijun Lyu U. Gadiraju Avishek Anand XAI 54 18 0 10 May 2021
Refining Language Models with Compositional Explanations Huihan Yao Ying Chen Qinyuan Ye Xisen Jin Xiang Ren 59 36 0 18 Mar 2021
Putting Humans in the Natural Language Processing Loop: A Survey Zijie J. Wang Dongjin Choi Shenyu Xu Diyi Yang LM&MA 77 74 0 06 Mar 2021
FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging Han Guo Nazneen Rajani Peter Hase Joey Tianyi Zhou Caiming Xiong TDI 121 116 0 31 Dec 2020
Debugging Tests for Model Explanations Julius Adebayo M. Muelly Ilaria Liccardi Been Kim FAtt 76 181 0 10 Nov 2020
Model-Agnostic Explanations using Minimal Forcing Subsets Xing Han Joydeep Ghosh AAML 30 4 0 01 Nov 2020
Formalizing Trust in Artificial Intelligence: Prerequisites, Causes and Goals of Human Trust in AI Alon Jacovi Ana Marasović Tim Miller Yoav Goldberg 312 447 0 15 Oct 2020
FIND: Human-in-the-Loop Debugging Deep Text Classifiers Piyawat Lertvittayakumjorn Lucia Specia Francesca Toni 45 54 0 10 Oct 2020
Machine Guides, Human Supervises: Interactive Learning with Global Explanations Teodora Popordanoska Mohit Kumar Stefano Teso 104 21 0 21 Sep 2020
Soliciting Human-in-the-Loop User Feedback for Interactive Machine Learning Reduces User Trust and Impressions of Model Accuracy Donald R. Honeycutt Mahsan Nourani Eric D. Ragan HAI 80 63 0 28 Aug 2020
The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models Ian Tenney James Wexler Jasmijn Bastings Tolga Bolukbasi Andy Coenen ... Ellen Jiang Mahima Pushkarna Carey Radebaugh Emily Reif Ann Yuan VLM 124 196 0 12 Aug 2020
Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions Xiaochuang Han Byron C. Wallace Yulia Tsvetkov MILM FAtt AAML TDI 79 174 0 14 May 2020
Beyond Accuracy: Behavioral Testing of NLP models with CheckList Marco Tulio Ribeiro Tongshuang Wu Carlos Guestrin Sameer Singh ELM 210 1,110 0 08 May 2020
Pre-trained Models for Natural Language Processing: A Survey Xipeng Qiu Tianxiang Sun Yige Xu Yunfan Shao Ning Dai Xuanjing Huang LM&MA VLM 377 1,489 0 18 Mar 2020
A Primer in BERTology: What we know about how BERT works Anna Rogers Olga Kovaleva Anna Rumshisky OffRL 97 1,503 0 27 Feb 2020
Debugging Machine Learning Pipelines Raoni Lourenço J. Freire D. Shasha AI4CE 75 28 0 11 Feb 2020
Making deep neural networks right for the right scientific reasons by interacting with their explanations P. Schramowski Wolfgang Stammer Stefano Teso Anna Brugger Xiaoting Shao Hans-Georg Luigs Anne-Katrin Mahlein Kristian Kersting 123 213 0 15 Jan 2020
"Why is 'Chicago' deceptive?" Towards Building Model-Driven Tutorials for Humans Vivian Lai Han Liu Chenhao Tan 84 143 0 14 Jan 2020
Effect of Confidence and Explanation on Accuracy and Trust Calibration in AI-Assisted Decision Making Yunfeng Zhang Q. V. Liao Rachel K. E. Bellamy 86 681 0 07 Jan 2020
The relationship between trust in AI and trustworthy machine learning technologies Ehsan Toreini Mhairi Aitken Kovila P. L. Coopamootoo Karen Elliott Carlos Vladimiro Gonzalez Zelaya Aad van Moorsel FaML 73 260 0 27 Nov 2019
exBERT: A Visual Analysis Tool to Explore Learned Representations in Transformers Models Benjamin Hoover Hendrik Strobelt Sebastian Gehrmann 35 86 0 11 Oct 2019
Interpretations are useful: penalizing explanations to align neural networks with prior knowledge Laura Rieger Chandan Singh W. James Murdoch Bin Yu FAtt 98 215 0 30 Sep 2019
One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques Vijay Arya Rachel K. E. Bellamy Pin-Yu Chen Amit Dhurandhar Michael Hind ... Karthikeyan Shanmugam Moninder Singh Kush R. Varshney Dennis L. Wei Yunfeng Zhang XAI 72 393 0 06 Sep 2019
Human-grounded Evaluations of Explanation Methods for Text Classification Piyawat Lertvittayakumjorn Francesca Toni FAtt 79 67 0 29 Aug 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach Yinhan Liu Myle Ott Naman Goyal Jingfei Du Mandar Joshi Danqi Chen Omer Levy M. Lewis Luke Zettlemoyer Veselin Stoyanov AIMat 689 24,557 0 26 Jul 2019
Interpretable and Steerable Sequence Learning via Prototypes Yao Ming Panpan Xu Huamin Qu Liu Ren AI4TS 63 141 0 23 Jul 2019
Informed Machine Learning -- A Taxonomy and Survey of Integrating Knowledge into Learning Systems Laura von Rueden S. Mayer Katharina Beckh B. Georgiev Sven Giesselbach ... Rajkumar Ramamurthy Michal Walczak Jochen Garcke Christian Bauckhage Jannis Schuecker 94 643 0 29 Mar 2019
Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference R. Thomas McCoy Ellie Pavlick Tal Linzen 143 1,244 0 04 Feb 2019
Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting Maria De-Arteaga Alexey Romanov Hanna M. Wallach J. Chayes C. Borgs Alexandra Chouldechova S. Geyik K. Kenthapadi Adam Tauman Kalai 202 462 0 27 Jan 2019
e-SNLI: Natural Language Inference with Natural Language Explanations Oana-Maria Camburu Tim Rocktäschel Thomas Lukasiewicz Phil Blunsom LRM 427 641 0 04 Dec 2018
On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study on Deception Detection Vivian Lai Chenhao Tan 78 380 0 19 Nov 2018
Interpreting Black Box Predictions using Fisher Kernels Rajiv Khanna Been Kim Joydeep Ghosh Oluwasanmi Koyejo FAtt 83 104 0 23 Oct 2018
Adversarial TableQA: Attention Supervision for Question Answering on Tables Minseok Cho Reinald Kim Amplayo Seung-won Hwang Jonghyuck Park LMTD OOD 56 22 0 18 Oct 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Jacob Devlin Ming-Wei Chang Kenton Lee Kristina Toutanova VLM SSL SSeg 1.8K 95,229 0 11 Oct 2018
Reducing Gender Bias in Abusive Language Detection Ji Ho Park Jamin Shin Pascale Fung FaML 64 341 0 22 Aug 2018
Adversarial Example Generation with Syntactically Controlled Paraphrase Networks Mohit Iyyer John Wieting Kevin Gimpel Luke Zettlemoyer AAML GAN 349 719 0 17 Apr 2018
Annotation Artifacts in Natural Language Inference Data Suchin Gururangan Swabha Swayamdipta Omer Levy Roy Schwartz Samuel R. Bowman Noah A. Smith 161 1,180 0 06 Mar 2018
Manipulating and Measuring Model Interpretability Forough Poursabzi-Sangdeh D. Goldstein Jake M. Hofman Jennifer Wortman Vaughan Hanna M. Wallach 99 701 0 21 Feb 2018
How do Humans Understand Explanations from Machine Learning Systems? An Evaluation of the Human-Interpretability of Explanation Menaka Narayanan Emily Chen Jeffrey He Been Kim S. Gershman Finale Doshi-Velez FAtt XAI 110 244 0 02 Feb 2018
Adversarial Examples for Evaluating Reading Comprehension Systems Robin Jia Percy Liang AAML ELM 211 1,609 0 23 Jul 2017
Developing Bug-Free Machine Learning Systems With Formal Mathematics Daniel Selsam Percy Liang D. Dill 37 55 0 26 Jun 2017
Explanation in Artificial Intelligence: Insights from the Social Sciences Tim Miller XAI 254 4,281 0 22 Jun 2017
SmoothGrad: removing noise by adding noise D. Smilkov Nikhil Thorat Been Kim F. Viégas Martin Wattenberg FAtt ODL 210 2,236 0 12 Jun 2017
A Unified Approach to Interpreting Model Predictions Scott M. Lundberg Su-In Lee FAtt 1.1K 22,090 0 22 May 2017
Understanding Black-box Predictions via Influence Functions Pang Wei Koh Percy Liang TDI 219 2,910 0 14 Mar 2017
Axiomatic Attribution for Deep Networks Mukund Sundararajan Ankur Taly Qiqi Yan OOD FAtt 193 6,027 0 04 Mar 2017
Bag of Tricks for Efficient Text Classification Armand Joulin Edouard Grave Piotr Bojanowski Tomas Mikolov VLM 183 4,632 0 06 Jul 2016
Explaining Predictions of Non-Linear Classifiers in NLP L. Arras F. Horn G. Montavon K. Müller Wojciech Samek FAtt 84 117 0 23 Jun 2016