Incorporating Priors with Feature Attribution on Text Classification

19 June 2019

Papers citing "Incorporating Priors with Feature Attribution on Text Classification"


Title | Authors | Tags | Counts | Date
Large Language Models as Attribution Regularizers for Efficient Model Training | Davor Vukadin, Marin Šilić, Goran Delač | - | 36 0 0 | 27 Feb 2025
InjecGuard: Benchmarking and Mitigating Over-defense in Prompt Injection Guardrail Models | H. Li, Xiaogeng Liu | SILM | 42 4 0 | 30 Oct 2024
Explanation Regularisation through the Lens of Attributions | Pedro Ferreira, Wilker Aziz, Ivan Titov | - | 36 1 0 | 23 Jul 2024
Exploring the Trade-off Between Model Performance and Explanation Plausibility of Text Classifiers Using Human Rationales | Lucas Resck, Marcos M. Raimundo, Jorge Poco | - | 42 1 0 | 03 Apr 2024
Explaining black box text modules in natural language with language models | Chandan Singh, Aliyah R. Hsu, Richard Antonello, Shailee Jain, Alexander G. Huth, Bin-Xia Yu, Jianfeng Gao | MILM | 21 46 0 | 17 May 2023
XMD: An End-to-End Framework for Interactive Explanation-Based Debugging of NLP Models | Dong-Ho Lee, Akshen Kadakia, Brihi Joshi, Aaron Chan, Ziyi Liu, ..., Takashi Shibuya, Ryosuke Mitani, Toshiyuki Sekiya, Jay Pujara, Xiang Ren | LRM | 40 9 0 | 30 Oct 2022
Fairness via Adversarial Attribute Neighbourhood Robust Learning | Q. Qi, Shervin Ardeshir, Yi Tian Xu, Tianbao Yang | - | 35 0 0 | 12 Oct 2022
Domain Classification-based Source-specific Term Penalization for Domain Adaptation in Hate-speech Detection | Tulika Bose, Nikolaos Aletras, Irina Illina, Dominique Fohr | - | 11 0 0 | 18 Sep 2022
Shortcut Learning of Large Language Models in Natural Language Understanding | Mengnan Du, Fengxiang He, Na Zou, Dacheng Tao, Xia Hu | KELM, OffRL | 28 83 0 | 25 Aug 2022
Dynamically Refined Regularization for Improving Cross-corpora Hate Speech Detection | Tulika Bose, Nikolaos Aletras, Irina Illina, Dominique Fohr | - | 40 5 0 | 23 Mar 2022
FairPrune: Achieving Fairness Through Pruning for Dermatological Disease Diagnosis | Yawen Wu, Dewen Zeng, Xiaowei Xu, Yiyu Shi, Jingtong Hu | MedIm | 21 51 0 | 04 Mar 2022
Aligning Eyes between Humans and Deep Neural Network through Interactive Attention Alignment | Yuyang Gao, Tong Sun, Liang Zhao, Sungsoo Ray Hong | HAI | 21 37 0 | 06 Feb 2022
Modeling Techniques for Machine Learning Fairness: A Survey | Mingyang Wan, Daochen Zha, Ninghao Liu, Na Zou | SyDa, FaML | 24 36 0 | 04 Nov 2021
Double Trouble: How to not explain a text classifier's decisions using counterfactuals synthesized by masked language models? | Thang M. Pham, Trung H. Bui, Long Mai, Anh Totti Nguyen | - | 21 7 0 | 22 Oct 2021
Enjoy the Salience: Towards Better Transformer-based Faithful Explanations with Word Salience | G. Chrysostomou, Nikolaos Aletras | - | 24 16 0 | 31 Aug 2021
Fairness via Representation Neutralization | Mengnan Du, Subhabrata Mukherjee, Guanchu Wang, Ruixiang Tang, Ahmed Hassan Awadallah, Xia Hu | - | 23 76 0 | 23 Jun 2021
Shapley Explanation Networks | Rui Wang, Xiaoqian Wang, David I. Inouye | TDI, FAtt | 19 44 0 | 06 Apr 2021
Efficient Explanations from Empirical Explainers | Robert Schwarzenberg, Nils Feldhus, Sebastian Möller | FAtt | 27 9 0 | 29 Mar 2021
Learning Variational Word Masks to Improve the Interpretability of Neural Text Classifiers | Hanjie Chen, Yangfeng Ji | AAML, VLM | 13 62 0 | 01 Oct 2020
Explaining Explanations: Axiomatic Feature Interactions for Deep Networks | Joseph D. Janizek, Pascal Sturmfels, Su-In Lee | FAtt | 24 143 0 | 10 Feb 2020
Making deep neural networks right for the right scientific reasons by interacting with their explanations | P. Schramowski, Wolfgang Stammer, Stefano Teso, Anna Brugger, Xiaoting Shao, Hans-Georg Luigs, Anne-Katrin Mahlein, Kristian Kersting | - | 15 207 0 | 15 Jan 2020
Learning Adversarially Fair and Transferable Representations | David Madras, Elliot Creager, T. Pitassi, R. Zemel | FaML | 233 673 0 | 17 Feb 2018
Convolutional Neural Networks for Sentence Classification | Yoon Kim | AILaw, VLM | 255 13,364 0 | 25 Aug 2014