Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection

16 April 2020
Shauli Ravfogel, Yanai Elazar, Hila Gonen, Michael Twiton, Yoav Goldberg

Papers citing "Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection"

Showing 50 of 260 citing papers.

Competence-Based Analysis of Language Models
Adam Davies, Jize Jiang, Chengxiang Zhai
01 Mar 2023

Efficient fair PCA for fair representation learning
Matthäus Kleindessner, Michele Donini, Chris Russell, Muhammad Bilal Zafar
26 Feb 2023

A Review of the Role of Causality in Developing Trustworthy AI Systems
Niloy Ganguly, Dren Fazlija, Maryam Badar, M. Fisichella, Sandipan Sikdar, ..., Koustav Rudra, Manolis Koubarakis, Gourab K. Patro, W. Z. E. Amri, Wolfgang Nejdl
14 Feb 2023

Parameter-efficient Modularised Bias Mitigation via AdapterFusion
Deepak Kumar, Oleg Lesota, George Zerveas, Daniel Cohen, Carsten Eickhoff, Markus Schedl, Navid Rekabsaz
13 Feb 2023

Fair Enough: Standardizing Evaluation and Model Selection for Fairness Research in NLP
Xudong Han, Timothy Baldwin, Trevor Cohn
11 Feb 2023

Erasure of Unaligned Attributes from Neural Representations
Shun Shao, Yftah Ziser, Shay B. Cohen
06 Feb 2023

Comparing Intrinsic Gender Bias Evaluation Measures without using Human Annotated Examples
Masahiro Kaneko, Danushka Bollegala, Naoaki Okazaki
28 Jan 2023

Break It Down: Evidence for Structural Compositionality in Neural Networks
Michael A. Lepori, Thomas Serre, Ellie Pavlick
26 Jan 2023

Joint processing of linguistic properties in brains and language models
S. Oota, Manish Gupta, Mariya Toneva
15 Dec 2022

Better Hit the Nail on the Head than Beat around the Bush: Removing Protected Attributes with a Single Projection
P. Haghighatkhah, Antske Fokkens, Pia Sommerauer, Bettina Speckmann, Kevin Verbeek
08 Dec 2022

Conceptor-Aided Debiasing of Large Language Models
Yifei Li, Lyle Ungar, João Sedoc
20 Nov 2022

Does Debiasing Inevitably Degrade the Model Performance
Yiran Liu, Xiao-Yang Liu, Haotian Chen, Yang Yu
14 Nov 2022

Nano: Nested Human-in-the-Loop Reward Learning for Few-shot Language Model Control
Xiang Fan, Yiwei Lyu, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency
10 Nov 2022

Bridging Fairness and Environmental Sustainability in Natural Language Processing
Marius Hessenthaler, Emma Strubell, Dirk Hovy, Anne Lauscher
08 Nov 2022

Fair and Optimal Classification via Post-Processing
Ruicheng Xian, Lang Yin, Han Zhao
03 Nov 2022

MABEL: Attenuating Gender Bias using Textual Entailment Data
Jacqueline He, Mengzhou Xia, C. Fellbaum, Danqi Chen
26 Oct 2022

Causal Analysis of Syntactic Agreement Neurons in Multilingual Language Models
Aaron Mueller, Yudi Xia, Tal Linzen
25 Oct 2022

Are All Spurious Features in Natural Language Alike? An Analysis through a Causal Lens
Nitish Joshi, X. Pan, Hengxing He
25 Oct 2022

Spectral Probing
Max Müller-Eberstein, Rob van der Goot, Barbara Plank
21 Oct 2022

Towards Procedural Fairness: Uncovering Biases in How a Toxic Language Classifier Uses Sentiment Information
I. Nejadgholi, Esma Balkir, Kathleen C. Fraser, S. Kiritchenko
19 Oct 2022

Log-linear Guardedness and its Implications
Shauli Ravfogel, Yoav Goldberg, Ryan Cotterell
18 Oct 2022

Systematic Evaluation of Predictive Fairness
Xudong Han, Aili Shen, Trevor Cohn, Timothy Baldwin, Lea Frermann
17 Oct 2022

Controlling Bias Exposure for Fair Interpretable Predictions
Zexue He, Yu Wang, Julian McAuley, Bodhisattwa Prasad Majumder
14 Oct 2022

InterFair: Debiasing with Natural Language Feedback for Fair Interpretable Predictions
Bodhisattwa Prasad Majumder, Zexue He, Julian McAuley
14 Oct 2022

SODAPOP: Open-Ended Discovery of Social Biases in Social Commonsense Reasoning Models
Haozhe An, Zongxia Li, Jieyu Zhao, Rachel Rudinger
13 Oct 2022

Debiasing isn't enough! -- On the Effectiveness of Debiasing MLMs and their Social Biases in Downstream Tasks
Masahiro Kaneko, Danushka Bollegala, Naoaki Okazaki
06 Oct 2022

Causal Proxy Models for Concept-Based Model Explanations
Zhengxuan Wu, Karel D'Oosterlinck, Atticus Geiger, Amir Zur, Christopher Potts
28 Sep 2022

Towards Faithful Model Explanation in NLP: A Survey
Qing Lyu, Marianna Apidianaki, Chris Callison-Burch
22 Sep 2022

Sustaining Fairness via Incremental Learning
Somnath Basu Roy Chowdhury, Snigdha Chaturvedi
25 Aug 2022

Visual Comparison of Language Model Adaptation
Rita Sevastjanova, E. Cakmak, Shauli Ravfogel, Ryan Cotterell, Mennatallah El-Assady
17 Aug 2022

What Artificial Neural Networks Can Tell Us About Human Language Acquisition
Alex Warstadt, Samuel R. Bowman
17 Aug 2022

Unit Testing for Concepts in Neural Networks
Charles Lovering, Ellie Pavlick
28 Jul 2022

The Birth of Bias: A case study on the evolution of gender bias in an English language model
Oskar van der Wal, Jaap Jumelet, K. Schulz, Willem H. Zuidema
21 Jul 2022

Probing Classifiers are Unreliable for Concept Removal and Detection
Abhinav Kumar, Chenhao Tan, Amit Sharma
08 Jul 2022

Don't Forget About Pronouns: Removing Gender Bias in Language Models Without Losing Factual Gender Information
Tomasz Limisiewicz, David Marecek
21 Jun 2022

What Changed? Investigating Debiasing Methods using Causal Mediation Analysis
Su-Ha Jeoung, Jana Diesner
01 Jun 2022

Can Transformer be Too Compositional? Analysing Idiom Processing in Neural Machine Translation
Verna Dankers, Christopher G. Lucas, Ivan Titov
30 May 2022

Modular and On-demand Bias Mitigation with Attribute-Removal Subnetworks
Lukas Hauzenberger, Shahed Masoudian, Deepak Kumar, Markus Schedl, Navid Rekabsaz
30 May 2022

CEBaB: Estimating the Causal Effects of Real-World Concepts on NLP Model Behavior
Eldar David Abraham, Karel D'Oosterlinck, Amir Feder, Y. Gat, Atticus Geiger, Christopher Potts, Roi Reichart, Zhengxuan Wu
27 May 2022

Less Learn Shortcut: Analyzing and Mitigating Learning of Spurious Feature-Label Correlation
Yanrui Du, Jing Yang, Yan Chen, Jing Liu, Sendong Zhao, Qiaoqiao She, Huaqin Wu, Haifeng Wang, Bing Qin
25 May 2022

Conditional Supervised Contrastive Learning for Fair Text Classification
Jianfeng Chi, Will Shand, Yaodong Yu, Kai-Wei Chang, Han Zhao, Yuan Tian
23 May 2022

Gender Bias in Meta-Embeddings
Masahiro Kaneko, Danushka Bollegala, Naoaki Okazaki
19 May 2022

Towards Debiasing Translation Artifacts
Koel Dutta Chowdhury, Rricha Jalota, C. España-Bonet, Josef van Genabith
16 May 2022

Naturalistic Causal Probing for Morpho-Syntax
Afra Amini, Tiago Pimentel, Clara Meister, Ryan Cotterell
14 May 2022

Fair NLP Models with Differentially Private Text Encoders
Gaurav Maheshwari, Pascal Denis, Mikaela Keller, A. Bellet
12 May 2022

Learning Disentangled Textual Representations via Statistical Measures of Similarity
Pierre Colombo, Guillaume Staerman, Nathan Noiry, Pablo Piantanida
07 May 2022

Theories of "Gender" in NLP Bias Research
Hannah Devinney, Jenny Björklund, H. Björklund
05 May 2022

Optimising Equal Opportunity Fairness in Model Training
Aili Shen, Xudong Han, Trevor Cohn, Timothy Baldwin, Lea Frermann
05 May 2022

fairlib: A Unified Framework for Assessing and Improving Classification Fairness
Xudong Han, Aili Shen, Yitong Li, Lea Frermann, Timothy Baldwin, Trevor Cohn
04 May 2022

Conceptualizing Treatment Leakage in Text-based Causal Inference
Adel Daoud, Connor Jerzak, Richard Johansson
01 May 2022