Incorporating Priors with Feature Attribution on Text Classification

19 June 2019

Papers citing "Incorporating Priors with Feature Attribution on Text Classification"

27 / 27 papers shown

Title | Authors | Tags | Counts | Date
Large Language Models as Attribution Regularizers for Efficient Model Training | Davor Vukadin, Marin Šilić, Goran Delač | - | 41 0 0 | 27 Feb 2025
InjecGuard: Benchmarking and Mitigating Over-defense in Prompt Injection Guardrail Models | Hao Li, Xiaogeng Liu | SILM | 42 5 0 | 30 Oct 2024
Explanation Regularisation through the Lens of Attributions | Pedro Ferreira, Wilker Aziz, Ivan Titov | - | 43 1 0 | 23 Jul 2024
Exploring the Trade-off Between Model Performance and Explanation Plausibility of Text Classifiers Using Human Rationales | Lucas Resck, Marcos M. Raimundo, Jorge Poco | - | 50 1 0 | 03 Apr 2024
Explaining black box text modules in natural language with language models | Chandan Singh, Aliyah R. Hsu, Richard Antonello, Shailee Jain, Alexander G. Huth, Bin-Xia Yu, Jianfeng Gao | MILM | 34 47 0 | 17 May 2023
Going Beyond XAI: A Systematic Survey for Explanation-Guided Learning | Yuyang Gao, Siyi Gu, Junji Jiang, S. Hong, Dazhou Yu, Liang Zhao | - | 29 39 0 | 07 Dec 2022
XMD: An End-to-End Framework for Interactive Explanation-Based Debugging of NLP Models | Dong-Ho Lee, Akshen Kadakia, Brihi Joshi, Aaron Chan, Ziyi Liu, ..., Takashi Shibuya, Ryosuke Mitani, Toshiyuki Sekiya, Jay Pujara, Xiang Ren | LRM | 40 9 0 | 30 Oct 2022
Fairness via Adversarial Attribute Neighbourhood Robust Learning | Q. Qi, Shervin Ardeshir, Yi Tian Xu, Tianbao Yang | - | 40 0 0 | 12 Oct 2022
Domain Classification-based Source-specific Term Penalization for Domain Adaptation in Hate-speech Detection | Tulika Bose, Nikolaos Aletras, Irina Illina, Dominique Fohr | - | 13 0 0 | 18 Sep 2022
Shortcut Learning of Large Language Models in Natural Language Understanding | Mengnan Du, Fengxiang He, Na Zou, Dacheng Tao, Xia Hu | KELM, OffRL | 34 84 0 | 25 Aug 2022
Neural Contrastive Clustering: Fully Unsupervised Bias Reduction for Sentiment Classification | Jared Mowery | SSL | 20 0 0 | 22 Apr 2022
Dynamically Refined Regularization for Improving Cross-corpora Hate Speech Detection | Tulika Bose, Nikolaos Aletras, Irina Illina, Dominique Fohr | - | 45 5 0 | 23 Mar 2022
FairPrune: Achieving Fairness Through Pruning for Dermatological Disease Diagnosis | Yawen Wu, Dewen Zeng, Xiaowei Xu, Yiyu Shi, Jingtong Hu | MedIm | 29 51 0 | 04 Mar 2022
Aligning Eyes between Humans and Deep Neural Network through Interactive Attention Alignment | Yuyang Gao, Tong Sun, Liang Zhao, Sungsoo Ray Hong | HAI | 21 37 0 | 06 Feb 2022
Modeling Techniques for Machine Learning Fairness: A Survey | Mingyang Wan, Daochen Zha, Ninghao Liu, Na Zou | SyDa, FaML | 30 36 0 | 04 Nov 2021
Double Trouble: How to not explain a text classifier's decisions using counterfactuals synthesized by masked language models? | Thang M. Pham, Trung H. Bui, Long Mai, Anh Totti Nguyen | - | 21 7 0 | 22 Oct 2021
Enjoy the Salience: Towards Better Transformer-based Faithful Explanations with Word Salience | G. Chrysostomou, Nikolaos Aletras | - | 32 16 0 | 31 Aug 2021
EDITS: Modeling and Mitigating Data Bias for Graph Neural Networks | Yushun Dong, Ninghao Liu, B. Jalaeian, Jundong Li | - | 28 117 0 | 11 Aug 2021
Fairness via Representation Neutralization | Mengnan Du, Subhabrata Mukherjee, Guanchu Wang, Ruixiang Tang, Ahmed Hassan Awadallah, Xia Hu | - | 25 78 0 | 23 Jun 2021
Shapley Explanation Networks | Rui Wang, Xiaoqian Wang, David I. Inouye | TDI, FAtt | 21 44 0 | 06 Apr 2021
Efficient Explanations from Empirical Explainers | Robert Schwarzenberg, Nils Feldhus, Sebastian Möller | FAtt | 32 9 0 | 29 Mar 2021
Learning Variational Word Masks to Improve the Interpretability of Neural Text Classifiers | Hanjie Chen, Yangfeng Ji | AAML, VLM | 13 63 0 | 01 Oct 2020
Contextualizing Hate Speech Classifiers with Post-hoc Explanation | Brendan Kennedy, Xisen Jin, Aida Mostafazadeh Davani, Morteza Dehghani, Xiang Ren | - | 6 137 0 | 05 May 2020
Explaining Explanations: Axiomatic Feature Interactions for Deep Networks | Joseph D. Janizek, Pascal Sturmfels, Su-In Lee | FAtt | 30 143 0 | 10 Feb 2020
Making deep neural networks right for the right scientific reasons by interacting with their explanations | P. Schramowski, Wolfgang Stammer, Stefano Teso, Anna Brugger, Xiaoting Shao, Hans-Georg Luigs, Anne-Katrin Mahlein, Kristian Kersting | - | 32 207 0 | 15 Jan 2020
Learning Adversarially Fair and Transferable Representations | David Madras, Elliot Creager, T. Pitassi, R. Zemel | FaML | 233 674 0 | 17 Feb 2018
Convolutional Neural Networks for Sentence Classification | Yoon Kim | AILaw, VLM | 255 13,364 0 | 25 Aug 2014