Pair the Dots: Jointly Examining Training History and Test Stimuli for Model Interpretability

14 October 2020

Jiwei Li

Papers citing "Pair the Dots: Jointly Examining Training History and Test Stimuli for Model Interpretability"

Title; authors; tags; counts; date
How do languages influence each other? Studying cross-lingual data sharing during LM fine-tuning; Rochelle Choenni, Dan Garrette, Ekaterina Shutova; -; 40 16 0; 22 May 2023
A General Framework for Defending Against Backdoor Attacks via Influence Graph; Xiaofei Sun, Jiwei Li, Xiaoya Li, Ziyao Wang, Tianwei Zhang, Han Qiu, Fei Wu, Chun Fan; AAML, TDI; 24 5 0; 29 Nov 2021
Interpreting Deep Learning Models in Natural Language Processing: A Review; Xiaofei Sun, Diyi Yang, Xiaoya Li, Tianwei Zhang, Yuxian Meng, Han Qiu, Guoyin Wang, Eduard H. Hovy, Jiwei Li; -; 19 44 0; 20 Oct 2021
On Sample Based Explanation Methods for NLP: Efficiency, Faithfulness, and Semantic Evaluation; Wei Zhang, Ziming Huang, Yada Zhu, Guangnan Ye, Xiaodong Cui, Fan Zhang; -; 31 17 0; 09 Jun 2021
FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging; Han Guo, Nazneen Rajani, Peter Hase, Joey Tianyi Zhou, Caiming Xiong; TDI; 41 102 0; 31 Dec 2020
Self-Explaining Structures Improve NLP Models; Zijun Sun, Chun Fan, Qinghong Han, Xiaofei Sun, Yuxian Meng, Fei Wu, Jiwei Li; MILM, XAI, LRM, FAtt; 46 38 0; 03 Dec 2020
FreeLB: Enhanced Adversarial Training for Natural Language Understanding; Chen Zhu, Yu Cheng, Zhe Gan, S. Sun, Tom Goldstein, Jingjing Liu; AAML; 232 438 0; 25 Sep 2019
Certified Robustness to Adversarial Word Substitutions; Robin Jia, Aditi Raghunathan, Kerem Göksel, Percy Liang; AAML; 188 291 0; 03 Sep 2019
Generating Natural Language Adversarial Examples; M. Alzantot, Yash Sharma, Ahmed Elgohary, Bo-Jhang Ho, Mani B. Srivastava, Kai-Wei Chang; AAML; 258 915 0; 21 Apr 2018
A causal framework for explaining the predictions of black-box sequence-to-sequence models; David Alvarez-Melis, Tommi Jaakkola; CML; 232 200 0; 06 Jul 2017
Adversarial examples in the physical world; Alexey Kurakin, Ian Goodfellow, Samy Bengio; SILM, AAML; 293 5,842 0; 08 Jul 2016