Towards Procedural Fairness: Uncovering Biases in How a Toxic Language Classifier Uses Sentiment Information

19 October 2022

Papers citing "Towards Procedural Fairness: Uncovering Biases in How a Toxic Language Classifier Uses Sentiment Information"

Title
A Note on Bias to Complete | Jia Xu, Mona Diab | - | 47 2 0 | 18 Feb 2024
Concept-Based Explanations to Test for False Causal Relationships Learned by Abusive Language Classifiers | I. Nejadgholi, S. Kiritchenko, Kathleen C. Fraser, Esma Balkir | - | 26 0 0 | 04 Jul 2023
Probing Classifiers: Promises, Shortcomings, and Advances | Yonatan Belinkov | - | 226 405 0 | 24 Feb 2021
On Completeness-aware Concept-Based Explanations in Deep Neural Networks | Chih-Kuan Yeh, Been Kim, Sercan Ö. Arik, Chun-Liang Li, Tomas Pfister, Pradeep Ravikumar | FAtt | 122 297 0 | 17 Oct 2019
What you can cram into a single vector: Probing sentence embeddings for linguistic properties | Alexis Conneau, Germán Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni | - | 201 882 0 | 03 May 2018
Towards A Rigorous Science of Interpretable Machine Learning | Finale Doshi-Velez, Been Kim | XAI, FaML | 254 3,684 0 | 28 Feb 2017