Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection

16 April 2020

Papers citing "Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection"

50 / 260 papers shown

When Does Syntax Mediate Neural Language Model Performance? Evidence from Dropout Probes Mycal Tucker Tiwalayo Eisape Peng Qian R. Levy J. Shah MILM 28 12 0 20 Apr 2022
Analyzing Gender Representation in Multilingual Models Hila Gonen Shauli Ravfogel Yoav Goldberg 25 11 0 20 Apr 2022
Probing for the Usage of Grammatical Number Karim Lasri Tiago Pimentel Alessandro Lenci Thierry Poibeau Ryan Cotterell 38 56 0 19 Apr 2022
How Gender Debiasing Affects Internal Model Representations, and Why It Matters Hadas Orgad Seraphina Goldfarb-Tarrant Yonatan Belinkov 28 18 0 14 Apr 2022
Visualizing the Relationship Between Encoded Linguistic Information and Task Performance Jiannan Xiang Huayang Li Defu Lian Guoping Huang Taro Watanabe Lemao Liu 42 0 0 29 Mar 2022
Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists Giuseppe Attanasio Debora Nozza Dirk Hovy Elena Baralis 36 54 0 17 Mar 2022
Gold Doesn't Always Glitter: Spectral Removal of Linear and Nonlinear Guarded Attribute Information Shun Shao Yftah Ziser Shay B. Cohen AAML 16 25 0 15 Mar 2022
Sense Embeddings are also Biased--Evaluating Social Biases in Static and Contextualised Sense Embeddings Yi Zhou Masahiro Kaneko Danushka Bollegala 39 23 0 14 Mar 2022
Towards Equal Opportunity Fairness through Adversarial Learning Xudong Han Timothy Baldwin Trevor Cohn FaML 25 8 0 12 Mar 2022
On the data requirements of probing Zining Zhu Jixuan Wang Bai Li Frank Rudzicz 27 5 0 25 Feb 2022
CAREER: A Foundation Model for Labor Sequence Data Keyon Vafa Emil Palikot Tianyu Du Ayush Kanodia Susan Athey David M. Blei 30 5 0 16 Feb 2022
A Differential Entropy Estimator for Training Neural Networks Georg Pichler Pierre Colombo Malik Boudiaf Günther Koliander Pablo Piantanida 25 21 0 14 Feb 2022
Learning Fair Representations via Rate-Distortion Maximization Somnath Basu Roy Chowdhury Snigdha Chaturvedi FaML 8 14 0 31 Jan 2022
Kernelized Concept Erasure Shauli Ravfogel Francisco Vargas Yoav Goldberg Ryan Cotterell 29 32 0 28 Jan 2022
Linear Adversarial Concept Erasure Shauli Ravfogel Michael Twiton Yoav Goldberg Ryan Cotterell KELM 84 57 0 28 Jan 2022
Privacy-aware Early Detection of COVID-19 through Adversarial Training Omid Rohanian Samaneh Kouchaki A. Soltan Jenny Yang Morteza Rohanian Yang Yang David Clifton AAML OOD 36 6 0 09 Jan 2022
Inducing Causal Structure for Interpretable Neural Networks Atticus Geiger Zhengxuan Wu Hanson Lu J. Rozner Elisa Kreiss Thomas Icard Noah D. Goodman Christopher Potts CML OOD 35 71 0 01 Dec 2021
Evaluating Metrics for Bias in Word Embeddings Sarah Schröder Alexander Schulz Philip Kenneweg Robert Feldhans Fabian Hinder Barbara Hammer 21 10 0 15 Nov 2021
An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models Nicholas Meade Elinor Poole-Dayan Siva Reddy 22 124 0 16 Oct 2021
Second Order WinoBias (SoWinoBias) Test Set for Latent Gender Bias Detection in Coreference Resolution Hillary Dawkins 14 0 0 28 Sep 2021
Marked Attribute Bias in Natural Language Inference Hillary Dawkins 52 8 0 28 Sep 2021
Contrastive Learning for Fair Representations Aili Shen Xudong Han Trevor Cohn Timothy Baldwin Lea Frermann FaML 42 32 0 22 Sep 2021
Fairness-aware Class Imbalanced Learning Shivashankar Subramanian Afshin Rahimi Timothy Baldwin Trevor Cohn Lea Frermann FaML 109 28 0 21 Sep 2021
Evaluating Debiasing Techniques for Intersectional Biases Shivashankar Subramanian Xudong Han Timothy Baldwin Trevor Cohn Lea Frermann 110 48 0 21 Sep 2021
Balancing out Bias: Achieving Fairness Through Balanced Training Xudong Han Timothy Baldwin Trevor Cohn 26 39 0 16 Sep 2021
Style Pooling: Automatic Text Style Obfuscation for Improved Classification Fairness Fatemehsadat Mireshghallah Taylor Berg-Kirkpatrick 31 12 0 10 Sep 2021
Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond Amir Feder Katherine A. Keith Emaad A. Manzoor Reid Pryzant Dhanya Sridhar ... Roi Reichart Margaret E. Roberts Brandon M Stewart Victor Veitch Diyi Yang CML 41 235 0 02 Sep 2021
Harms of Gender Exclusivity and Challenges in Non-Binary Representation in Language Technologies Sunipa Dev Masoud Monajatipoor Anaelia Ovalle Arjun Subramonian J. M. Phillips Kai-Wei Chang 39 165 0 27 Aug 2021
On Measures of Biases and Harms in NLP Sunipa Dev Emily Sheng Jieyu Zhao Aubrie Amstutz Jiao Sun ... M. Sanseverino Jiin Kim Akihiro Nishi Nanyun Peng Kai-Wei Chang 33 80 0 07 Aug 2021
Debiasing Multilingual Word Embeddings: A Case Study of Three Indian Languages Srijan Bansal Vishal Garimella Ayush Suhane Animesh Mukherjee 30 9 0 21 Jul 2021
Towards Understanding and Mitigating Social Biases in Language Models Paul Pu Liang Chiyu Wu Louis-Philippe Morency Ruslan Salakhutdinov 36 380 0 24 Jun 2021
Modeling Profanity and Hate Speech in Social Media with Semantic Subspaces Vanessa Hahn Dana Ruiter Thomas Kleinbauer Dietrich Klakow 13 7 0 14 Jun 2021
Uncovering Constraint-Based Behavior in Neural Models via Targeted Fine-Tuning Forrest Davis Marten van Schijndel AI4CE 17 7 0 02 Jun 2021
Obstructing Classification via Projection P. Haghighatkhah Wouter Meulemans Bettina Speckmann Jérôme Urhausen Kevin Verbeek 38 6 0 19 May 2021
The Low-Dimensional Linear Geometry of Contextualized Word Representations Evan Hernandez Jacob Andreas MILM 28 40 0 15 May 2021
Counterfactual Interventions Reveal the Causal Effect of Relative Clause Representations on Agreement Prediction Shauli Ravfogel Grusha Prasad Tal Linzen Yoav Goldberg 36 57 0 14 May 2021
StylePTB: A Compositional Benchmark for Fine-grained Controllable Text Style Transfer Yiwei Lyu Paul Pu Liang Hai Pham Eduard H. Hovy Barnabas Poczos Ruslan Salakhutdinov Louis-Philippe Morency 27 41 0 12 Apr 2021
VERB: Visualizing and Interpreting Bias Mitigation Techniques for Word Representations Archit Rathore Sunipa Dev J. M. Phillips Vivek Srikumar Yan Zheng Chin-Chia Michael Yeh Junpeng Wang Wei Zhang Bei Wang 43 10 0 06 Apr 2021
The Rediscovery Hypothesis: Language Models Need to Meet Linguistics Vassilina Nikoulina Maxat Tezekbayev Nuradil Kozhakhmet Madina Babazhanova Matthias Gallé Z. Assylbekov 34 8 0 02 Mar 2021
Contrastive Explanations for Model Interpretability Alon Jacovi Swabha Swayamdipta Shauli Ravfogel Yanai Elazar Yejin Choi Yoav Goldberg 44 95 0 02 Mar 2021
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP Timo Schick Sahana Udupa Hinrich Schütze 265 374 0 28 Feb 2021
Diverse Adversaries for Mitigating Bias in Training Xudong Han Timothy Baldwin Trevor Cohn 13 62 0 25 Jan 2021
Dictionary-based Debiasing of Pre-trained Word Embeddings Masahiro Kaneko Danushka Bollegala FaML 38 38 0 23 Jan 2021
Debiasing Pre-trained Contextualised Embeddings Masahiro Kaneko Danushka Bollegala 218 137 0 23 Jan 2021
The Geometry of Distributed Representations for Better Alignment, Attenuated Bias, and Improved Interpretability Sunipa Dev 37 1 0 25 Nov 2020
Exploring Text Specific and Blackbox Fairness Algorithms in Multimodal Clinical NLP John Chen Ian Berlot-Attwell Safwan Hossain Xindi Wang Frank Rudzicz FaML 37 7 0 19 Nov 2020
Fairness and Robustness in Invariant Learning: A Case Study in Toxicity Classification Robert Adragna Elliot Creager David Madras R. Zemel OOD FaML 37 41 0 12 Nov 2020
On Transferability of Bias Mitigation Effects in Language Model Fine-Tuning Xisen Jin Francesco Barbieri Brendan Kennedy Aida Mostafazadeh Davani Leonardo Neves Xiang Ren 35 5 0 24 Oct 2020
It's not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT Hila Gonen Shauli Ravfogel Yanai Elazar Yoav Goldberg 26 50 0 16 Oct 2020
PrivNet: Safeguarding Private Attributes in Transfer Learning for Recommendation Guangneng Hu Qiang Yang 14 6 0 16 Oct 2020