Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1506.01066
Cited By
Visualizing and Understanding Neural Models in NLP
2 June 2015
Jiwei Li
Xinlei Chen
Eduard H. Hovy
Dan Jurafsky
MILM
FAtt
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Visualizing and Understanding Neural Models in NLP"
50 / 134 papers shown
Title
ForeCite: Adapting Pre-Trained Language Models to Predict Future Citation Rates of Academic Papers
Gavin Hull
Alex Bihlo
29
0
0
13 May 2025
Discovering Influential Neuron Path in Vision Transformers
Yifan Wang
Yifei Liu
Yingdong Shi
Chong Li
Anqi Pang
Sibei Yang
Jingyi Yu
Kan Ren
ViT
69
0
0
12 Mar 2025
Order Matters: Investigate the Position Bias in Multi-constraint Instruction Following
Jie Zeng
Qianyu He
Qingyu Ren
Jiaqing Liang
Yanghua Xiao
Weikang Zhou
Zeye Sun
Fei Yu
86
1
0
24 Feb 2025
Efficient Knowledge Distillation: Empowering Small Language Models with Teacher Model Insights
Mohamad Ballout
U. Krumnack
Gunther Heidemann
Kai-Uwe Kühnberger
35
2
0
19 Sep 2024
Counterfactuals As a Means for Evaluating Faithfulness of Attribution Methods in Autoregressive Language Models
Sepehr Kamahi
Yadollah Yaghoobzadeh
53
0
0
21 Aug 2024
Crafting Large Language Models for Enhanced Interpretability
Chung-En Sun
Tuomas P. Oikarinen
Tsui-Wei Weng
38
7
0
05 Jul 2024
Evaluating Human Alignment and Model Faithfulness of LLM Rationale
Mohsen Fayyaz
Fan Yin
Jiao Sun
Nanyun Peng
65
3
0
28 Jun 2024
Gradient based Feature Attribution in Explainable AI: A Technical Review
Yongjie Wang
Tong Zhang
Xu Guo
Zhiqi Shen
XAI
29
19
0
15 Mar 2024
Attention Meets Post-hoc Interpretability: A Mathematical Perspective
Gianluigi Lopardo
F. Precioso
Damien Garreau
18
4
0
05 Feb 2024
Quantifying Uncertainty in Natural Language Explanations of Large Language Models
Sree Harsha Tanneru
Chirag Agarwal
Himabindu Lakkaraju
LRM
27
14
0
06 Nov 2023
Multiscale Positive-Unlabeled Detection of AI-Generated Texts
Yuchuan Tian
Hanting Chen
Xutao Wang
Zheyuan Bai
Qinghua Zhang
Ruifeng Li
Chaoxi Xu
Yunhe Wang
DeLMO
38
43
0
29 May 2023
Explaining How Transformers Use Context to Build Predictions
Javier Ferrando
Gerard I. Gállego
Ioannis Tsiamas
Marta R. Costa-jussá
32
32
0
21 May 2023
Solving NLP Problems through Human-System Collaboration: A Discussion-based Approach
Masahiro Kaneko
Graham Neubig
Naoaki Okazaki
39
6
0
19 May 2023
Causal Analysis for Robust Interpretability of Neural Networks
Ola Ahmad
Nicolas Béreux
Loïc Baret
V. Hashemi
Freddy Lecue
CML
29
3
0
15 May 2023
Towards a Praxis for Intercultural Ethics in Explainable AI
Chinasa T. Okolo
39
3
0
24 Apr 2023
Effective Visualization and Analysis of Recommender Systems
Hao Wang
20
1
0
02 Mar 2023
Tell Model Where to Attend: Improving Interpretability of Aspect-Based Sentiment Classification via Small Explanation Annotations
Zhenxiao Cheng
Jie Zhou
Wen Wu
Qin Chen
Liang He
32
3
0
21 Feb 2023
Understanding and Detecting Hallucinations in Neural Machine Translation via Model Introspection
Weijia Xu
Sweta Agrawal
Eleftheria Briakou
Marianna J. Martindale
Marine Carpuat
HILM
27
47
0
18 Jan 2023
State-Regularized Recurrent Neural Networks to Extract Automata and Explain Predictions
Cheng Wang
Carolin (Haas) Lawrence
Mathias Niepert
21
3
0
10 Dec 2022
AutoCAD: Automatically Generating Counterfactuals for Mitigating Shortcut Learning
Jiaxin Wen
Yeshuang Zhu
Jinchao Zhang
Jie Zhou
Minlie Huang
CML
AAML
22
8
0
29 Nov 2022
Deconfounding Legal Judgment Prediction for European Court of Human Rights Cases Towards Better Alignment with Experts
Santosh T.Y.S.S
Shanshan Xu
O. Ichim
Matthias Grabmair
37
26
0
25 Oct 2022
Precisely the Point: Adversarial Augmentations for Faithful and Informative Text Generation
Wenhao Wu
Wei Li
Jiachen Liu
Xinyan Xiao
Sujian Li
Yajuan Lyu
42
3
0
22 Oct 2022
On the Explainability of Natural Language Processing Deep Models
Julia El Zini
M. Awad
29
82
0
13 Oct 2022
An Interpretability Evaluation Benchmark for Pre-trained Language Models
Ya-Ming Shen
Lijie Wang
Ying-Cong Chen
Xinyan Xiao
Jing Liu
Hua Wu
37
4
0
28 Jul 2022
A Unified Understanding of Deep NLP Models for Text Classification
Zhuguo Li
Xiting Wang
Weikai Yang
Jing Wu
Zhengyan Zhang
Zhiyuan Liu
Maosong Sun
Hui Zhang
Shixia Liu
VLM
28
30
0
19 Jun 2022
ORCA: Interpreting Prompted Language Models via Locating Supporting Data Evidence in the Ocean of Pretraining Data
Xiaochuang Han
Yulia Tsvetkov
24
28
0
25 May 2022
Lack of Fluency is Hurting Your Translation Model
J. Yoo
Jaewoo Kang
23
0
0
24 May 2022
The Solvability of Interpretability Evaluation Metrics
Yilun Zhou
J. Shah
76
8
0
18 May 2022
Clinical outcome prediction under hypothetical interventions -- a representation learning framework for counterfactual reasoning
Yikuan Li
M. Mamouei
Shishir Rao
A. Hassaine
D. Canoy
Thomas Lukasiewicz
K. Rahimi
G. Salimi-Khorshidi
OOD
CML
AI4CE
31
1
0
15 May 2022
The Road to Explainability is Paved with Bias: Measuring the Fairness of Explanations
Aparna Balagopalan
Haoran Zhang
Kimia Hamidieh
Thomas Hartvigsen
Frank Rudzicz
Marzyeh Ghassemi
38
78
0
06 May 2022
ExSum: From Local Explanations to Model Understanding
Yilun Zhou
Marco Tulio Ribeiro
J. Shah
FAtt
LRM
24
25
0
30 Apr 2022
It Takes Two Flints to Make a Fire: Multitask Learning of Neural Relation and Explanation Classifiers
Zheng Tang
Mihai Surdeanu
27
6
0
25 Apr 2022
Interpretation of Black Box NLP Models: A Survey
Shivani Choudhary
N. Chatterjee
S. K. Saha
FAtt
34
10
0
31 Mar 2022
Towards Explainable Evaluation Metrics for Natural Language Generation
Christoph Leiter
Piyawat Lertvittayakumjorn
M. Fomicheva
Wei-Ye Zhao
Yang Gao
Steffen Eger
AAML
ELM
30
20
0
21 Mar 2022
Measuring the Mixing of Contextual Information in the Transformer
Javier Ferrando
Gerard I. Gállego
Marta R. Costa-jussá
31
50
0
08 Mar 2022
Interpreting Language Models with Contrastive Explanations
Kayo Yin
Graham Neubig
MILM
23
78
0
21 Feb 2022
Diagnosing AI Explanation Methods with Folk Concepts of Behavior
Alon Jacovi
Jasmijn Bastings
Sebastian Gehrmann
Yoav Goldberg
Katja Filippova
36
15
0
27 Jan 2022
A Latent-Variable Model for Intrinsic Probing
Karolina Stañczak
Lucas Torroba Hennigen
Adina Williams
Ryan Cotterell
Isabelle Augenstein
29
4
0
20 Jan 2022
UNIREX: A Unified Learning Framework for Language Model Rationale Extraction
Aaron Chan
Maziar Sanjabi
Lambert Mathias
L Tan
Shaoliang Nie
Xiaochang Peng
Xiang Ren
Hamed Firooz
41
42
0
16 Dec 2021
Triggerless Backdoor Attack for NLP Tasks with Clean Labels
Leilei Gan
Jiwei Li
Tianwei Zhang
Xiaoya Li
Yuxian Meng
Fei Wu
Yi Yang
Shangwei Guo
Chun Fan
AAML
SILM
27
74
0
15 Nov 2021
Counterfactual Explanations for Models of Code
Jürgen Cito
Işıl Dillig
V. Murali
S. Chandra
AAML
LRM
32
48
0
10 Nov 2021
Understanding Interlocking Dynamics of Cooperative Rationalization
Mo Yu
Yang Zhang
Shiyu Chang
Tommi Jaakkola
22
41
0
26 Oct 2021
Interpreting Deep Learning Models in Natural Language Processing: A Review
Xiaofei Sun
Diyi Yang
Xiaoya Li
Tianwei Zhang
Yuxian Meng
Han Qiu
Guoyin Wang
Eduard H. Hovy
Jiwei Li
19
45
0
20 Oct 2021
Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Tokens and Retraining
Andreas Madsen
Nicholas Meade
Vaibhav Adlakha
Siva Reddy
111
35
0
15 Oct 2021
Influence Tuning: Demoting Spurious Correlations via Instance Attribution and Instance-Driven Updates
Xiaochuang Han
Yulia Tsvetkov
TDI
31
30
0
07 Oct 2021
Counterfactual Evaluation for Explainable AI
Yingqiang Ge
Shuchang Liu
Zelong Li
Shuyuan Xu
Shijie Geng
Yunqi Li
Juntao Tan
Fei Sun
Yongfeng Zhang
CML
38
14
0
05 Sep 2021
Enjoy the Salience: Towards Better Transformer-based Faithful Explanations with Word Salience
G. Chrysostomou
Nikolaos Aletras
32
16
0
31 Aug 2021
Neuron-level Interpretation of Deep NLP Models: A Survey
Hassan Sajjad
Nadir Durrani
Fahim Dalvi
MILM
AI4CE
35
80
0
30 Aug 2021
Deep Active Learning for Text Classification with Diverse Interpretations
Qiang Liu
Yanqiao Zhu
Zhaocheng Liu
Yufeng Zhang
Shu Wu
AI4CE
33
14
0
15 Aug 2021
Inverting and Understanding Object Detectors
Ang Cao
Justin Johnson
ObjD
33
3
0
26 Jun 2021
1
2
3
Next