
Interpreting Deep Learning Models in Natural Language Processing: A Review

20 October 2021

Diyi Yang

Jiwei Li

Papers citing "Interpreting Deep Learning Models in Natural Language Processing: A Review"

50 / 171 papers shown

e-SNLI: Natural Language Inference with Natural Language Explanations Oana-Maria Camburu Tim Rocktäschel Thomas Lukasiewicz Phil Blunsom LRM 421 640 0 04 Dec 2018
On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study on Deception Detection Vivian Lai Chenhao Tan 78 377 0 19 Nov 2018
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge Alon Talmor Jonathan Herzig Nicholas Lourie Jonathan Berant RALM 144 1,747 0 02 Nov 2018
Towards Explainable NLP: A Generative Explanation Framework for Text Classification Hui Liu Qingyu Yin William Yang Wang 102 148 0 01 Nov 2018
What can AI do for me: Evaluating Machine Learning Interpretations in Cooperative Play Shi Feng Jordan L. Boyd-Graber HAI 57 129 0 23 Oct 2018
Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks Yikang Shen Shawn Tan Alessandro Sordoni Aaron Courville 122 324 0 22 Oct 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Jacob Devlin Ming-Wei Chang Kenton Lee Kristina Toutanova VLM SSL SSeg 1.8K 95,175 0 11 Oct 2018
Interpreting Neural Networks With Nearest Neighbors Eric Wallace Shi Feng Jordan L. Boyd-Graber AAML FAtt MILM 110 54 0 08 Sep 2018
Generating More Interesting Responses in Neural Conversation Models with Distributional Constraints Ashutosh Baheti Alan Ritter Jiwei Li W. Dolan 86 91 0 04 Sep 2018
Learning Gender-Neutral Word Embeddings Jieyu Zhao Yichao Zhou Zeyu Li Wei Wang Kai-Wei Chang FaML 103 415 0 29 Aug 2018
Interpreting Recurrent and Attention-Based Neural Models: a Case Study on Natural Language Inference Reza Ghaeini Xiaoli Z. Fern Prasad Tadepalli MILM 70 97 0 12 Aug 2018
Towards Robust Interpretability with Self-Explaining Neural Networks David Alvarez-Melis Tommi Jaakkola MILM XAI 128 947 0 20 Jun 2018
Hierarchical interpretations for neural network predictions Chandan Singh W. James Murdoch Bin Yu 68 146 0 14 Jun 2018
Joint Embedding of Words and Labels for Text Classification Guoyin Wang Chunyuan Li Wenlin Wang Yizhe Zhang Dinghan Shen Xinyuan Zhang Ricardo Henao Lawrence Carin AI4TS VLM 49 392 0 10 May 2018
Interpretable Adversarial Perturbation in Input Embedding Space for Text Motoki Sato Jun Suzuki Hiroyuki Shindo Yuji Matsumoto 55 192 0 08 May 2018
Chinese NER Using Lattice LSTM Yue Zhang Jie Yang 88 677 0 05 May 2018
What you can cram into a single vector: Probing sentence embeddings for linguistic properties Alexis Conneau Germán Kruszewski Guillaume Lample Loïc Barrault Marco Baroni 351 896 0 03 May 2018
Pathologies of Neural Models Make Interpretations Difficult Shi Feng Eric Wallace Alvin Grissom II Mohit Iyyer Pedro Rodriguez Jordan L. Boyd-Graber AAML FAtt 82 321 0 20 Apr 2018
The Geometry of Culture: Analyzing Meaning through Word Embeddings Austin C. Kozlowski Matt Taddy James A. Evans 58 390 0 25 Mar 2018
Learning to Explain: An Information-Theoretic Perspective on Model Interpretation Jianbo Chen Le Song Martin J. Wainwright Michael I. Jordan MLT FAtt 153 575 0 21 Feb 2018
Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs W. James Murdoch Peter J. Liu Bin Yu 80 210 0 16 Jan 2018
Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV) Been Kim Martin Wattenberg Justin Gilmer Carrie J. Cai James Wexler F. Viégas Rory Sayres FAtt 227 1,850 0 30 Nov 2017
Embedding Words as Distributions with a Bayesian Skip-gram Model Arthur Bražinskas Serhii Havrylov Ivan Titov BDL 41 37 0 29 Nov 2017
Visualisation and 'diagnostic classifiers' reveal how recurrent and recursive neural networks process hierarchical structure Dieuwke Hupkes Sara Veldhoen Willem H. Zuidema 76 280 0 28 Nov 2017
SPINE: SParse Interpretable Neural Embeddings Anant Subramanian Danish Pruthi Harsh Jhamtani Taylor Berg-Kirkpatrick Eduard H. Hovy 37 132 0 23 Nov 2017
Neural Language Modeling by Jointly Learning Syntax and Lexicon Yikang Shen Zhouhan Lin Chin-Wei Huang Aaron Courville 60 178 0 02 Nov 2017
Semantic Structure and Interpretability of Word Embeddings Lutfi Kerem Senel Ihsan Utlu Veysel Yücesoy Aykut Koç Tolga Çukur 57 106 0 01 Nov 2017
Deep Learning for Case-Based Reasoning through Prototypes: A Neural Network that Explains Its Predictions Oscar Li Hao Liu Chaofan Chen Cynthia Rudin 178 592 0 13 Oct 2017
What does Attention in Neural Machine Translation Pay Attention to? Hamidreza Ghader Christof Monz 58 104 0 09 Oct 2017
Explanation in Artificial Intelligence: Insights from the Social Sciences Tim Miller XAI 250 4,273 0 22 Jun 2017
Explaining Recurrent Neural Network Predictions in Sentiment Analysis L. Arras G. Montavon K. Müller Wojciech Samek FAtt 66 354 0 22 Jun 2017
SmoothGrad: removing noise by adding noise D. Smilkov Nikhil Thorat Been Kim F. Viégas Martin Wattenberg FAtt ODL 207 2,235 0 12 Jun 2017
Attention Is All You Need Ashish Vaswani Noam M. Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan Gomez Lukasz Kaiser Illia Polosukhin 3DV 783 132,363 0 12 Jun 2017
Learning how to explain neural networks: PatternNet and PatternAttribution Pieter-Jan Kindermans Kristof T. Schütt Maximilian Alber K. Müller D. Erhan Been Kim Sven Dähne XAI FAtt 76 340 0 16 May 2017
Learning Important Features Through Propagating Activation Differences Avanti Shrikumar Peyton Greenside A. Kundaje FAtt 203 3,881 0 10 Apr 2017
Character-based Joint Segmentation and POS Tagging for Chinese using Bidirectional RNN-CRF Yan Shao Christian Hardmeier Jörg Tiedemann Joakim Nivre 74 106 0 05 Apr 2017
Understanding Black-box Predictions via Influence Functions Pang Wei Koh Percy Liang TDI 216 2,905 0 14 Mar 2017
Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations A. Ross M. C. Hughes Finale Doshi-Velez FAtt 131 591 0 10 Mar 2017
A Structured Self-attentive Sentence Embedding Zhouhan Lin Minwei Feng Cicero Nogueira dos Santos Mo Yu Bing Xiang Bowen Zhou Yoshua Bengio 115 2,141 0 09 Mar 2017
Axiomatic Attribution for Deep Networks Mukund Sundararajan Ankur Taly Qiqi Yan OOD FAtt 193 6,018 0 04 Mar 2017
Central Moment Discrepancy (CMD) for Domain-Invariant Representation Learning Werner Zellinger Thomas Grubinger E. Lughofer T. Natschläger Susanne Saminger-Platz OOD 105 578 0 28 Feb 2017
Understanding Neural Networks through Representation Erasure Jiwei Li Will Monroe Dan Jurafsky AAML MILM 95 567 0 24 Dec 2016
"What is Relevant in a Text Document?": An Interpretable Machine Learning Approach L. Arras F. Horn G. Montavon K. Müller Wojciech Samek 69 288 0 23 Dec 2016
Categorical Reparameterization with Gumbel-Softmax Eric Jang S. Gu Ben Poole BDL 354 5,379 0 03 Nov 2016
Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks Yossi Adi Einat Kermany Yonatan Belinkov Ofer Lavi Yoav Goldberg 79 546 0 15 Aug 2016
LSTMVis: A Tool for Visual Analysis of Hidden State Dynamics in Recurrent Neural Networks Hendrik Strobelt Sebastian Gehrmann Hanspeter Pfister Alexander M. Rush HAI 65 83 0 23 Jun 2016
Explaining Predictions of Non-Linear Classifiers in NLP L. Arras F. Horn G. Montavon K. Müller Wojciech Samek FAtt 76 117 0 23 Jun 2016
Rationalizing Neural Predictions Tao Lei Regina Barzilay Tommi Jaakkola 125 812 0 13 Jun 2016
A Fast Unified Model for Parsing and Sentence Understanding Samuel R. Bowman Jon Gauthier Abhinav Rastogi Raghav Gupta Christopher D. Manning Christopher Potts 53 314 0 19 Mar 2016
End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF Xuezhe Ma Eduard H. Hovy 111 2,655 0 04 Mar 2016