Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them

9 March 2019

Papers citing "Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them"

50 / 307 papers shown

Title
From Prejudice to Parity: A New Approach to Debiasing Large Language Model Word Embeddings | Aishik Rakshit, Smriti Singh, Shuvam Keshari, Arijit Ghosh Chowdhury, Vinija Jain, Aman Chadha | - | 37 0 0 | 18 Feb 2024
Representation Surgery: Theory and Practice of Affine Steering | Shashwat Singh, Shauli Ravfogel, Jonathan Herzig, Roee Aharoni, Ryan Cotterell, Ponnurangam Kumaraguru | LLMSV | 35 13 0 | 15 Feb 2024
Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting | Masahiro Kaneko, Danushka Bollegala, Naoaki Okazaki, Timothy Baldwin | LRM | 37 27 0 | 28 Jan 2024
Semantic Properties of cosine based bias scores for word embeddings | Sarah Schröder, Alexander Schulz, Fabian Hinder, Barbara Hammer | - | 29 1 0 | 27 Jan 2024
A Comprehensive View of the Biases of Toxicity and Sentiment Analysis Methods Towards Utterances with African American English Expressions | Guilherme H. Resende, L. F. Nery, Fabrício Benevenuto, Savvas Zannettou, Flavio Figueiredo | - | 22 6 0 | 23 Jan 2024
Gender Bias in Machine Translation and The Era of Large Language Models | Eva Vanmassenhove | AILaw | 32 1 0 | 18 Jan 2024
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs | Shengbang Tong, Zhuang Liu, Yuexiang Zhai, Yi Ma, Yann LeCun, Saining Xie | VLM MLLM | 41 286 0 | 11 Jan 2024
Whose wife is it anyway? Assessing bias against same-gender relationships in machine translation | Ian Stewart, Rada Mihalcea | - | 27 4 0 | 10 Jan 2024
SutraNets: Sub-series Autoregressive Networks for Long-Sequence, Probabilistic Forecasting | Shane Bergsma, Timothy J. Zeyl, Lei Guo | AI4TS | 38 3 0 | 22 Dec 2023
What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations | Raphael Tang, Xinyu Crystina Zhang, Jimmy J. Lin, Ferhan Ture | - | 38 7 0 | 30 Nov 2023
Bias A-head? Analyzing Bias in Transformer-Based Language Model Attention Heads | Yi Yang, Hanyu Duan, Ahmed Abbasi, John P. Lalor, Kar Yan Tam | - | 19 6 0 | 17 Nov 2023
Do Not Harm Protected Groups in Debiasing Language Representation Models | Chloe Qinyu Zhu, Rickard Stureborg, Brandon Fain | - | 29 0 0 | 27 Oct 2023
Is Probing All You Need? Indicator Tasks as an Alternative to Probing Embedding Spaces | Tal Levy, Omer Goldman, Reut Tsarfaty | - | 19 3 0 | 24 Oct 2023
Evaluating the Fairness of Discriminative Foundation Models in Computer Vision | Junaid Ali, Matthäus Kleindessner, F. Wenzel, Kailash Budhathoki, V. Cevher, Chris Russell | VLM | 67 10 0 | 18 Oct 2023
The Impact of Explanations on Fairness in Human-AI Decision-Making: Protected vs Proxy Features | Navita Goyal, Connor Baumler, Tin Nguyen, Hal Daumé | - | 26 6 0 | 12 Oct 2023
Survey of Social Bias in Vision-Language Models | Nayeon Lee, Yejin Bang, Holy Lovenia, Samuel Cahyawijaya, Wenliang Dai, Pascale Fung | VLM | 47 16 0 | 24 Sep 2023
Large language models can accurately predict searcher preferences | Paul Thomas, S. Spielman, Nick Craswell, Bhaskar Mitra | ALM LRM | 30 142 0 | 19 Sep 2023
A Neighbourhood-Aware Differential Privacy Mechanism for Static Word Embeddings | Danushka Bollegala, Shuichi Otake, T. Machide, Ken-ichi Kawarabayashi | - | 16 4 0 | 19 Sep 2023
DiffusionWorldViewer: Exposing and Broadening the Worldview Reflected by Generative Text-to-Image Models | Zoe De Simone, Angie Boggust, Arvindmani Satyanarayan, Ashia Wilson | - | 36 1 0 | 18 Sep 2023
Bias of AI-Generated Content: An Examination of News Produced by Large Language Models | Xiao Fang, Shangkun Che, Minjia Mao, Hongzhe Zhang, Ming Zhao, Xiaohang Zhao | - | 38 79 0 | 18 Sep 2023
In-Contextual Gender Bias Suppression for Large Language Models | Daisuke Oba, Masahiro Kaneko, Danushka Bollegala | - | 31 8 0 | 13 Sep 2023
Bias and Fairness in Large Language Models: A Survey | Isabel O. Gallegos, Ryan A. Rossi, Joe Barrow, Md Mehrab Tanjim, Sungchul Kim, Franck Dernoncourt, Tong Yu, Ruiyi Zhang, Nesreen Ahmed | AILaw | 32 493 0 | 02 Sep 2023
Thesis Distillation: Investigating The Impact of Bias in NLP Models on Hate Speech Detection | Fatma Elsafoury | - | 29 3 0 | 31 Aug 2023
FairMonitor: A Four-Stage Automatic Framework for Detecting Stereotypes and Biases in Large Language Models | Yanhong Bai, Jiabao Zhao, Jinxin Shi, Tingjiang Wei, Xingjiao Wu, Liangbo He | - | 36 0 0 | 21 Aug 2023
Instructed to Bias: Instruction-Tuned Language Models Exhibit Emergent Cognitive Bias | Itay Itzhak, Gabriel Stanovsky, Nir Rosenfeld, Yonatan Belinkov | - | 27 20 0 | 01 Aug 2023
The Resume Paradox: Greater Language Differences, Smaller Pay Gaps | J. Minot, Marc E. Maier, Bradford Demarest, Nicholas Cheney, C. Danforth, P. Dodds, M. Frank | - | 11 0 0 | 17 Jul 2023
Learning to Generate Equitable Text in Dialogue from Biased Training Data | Anthony Sicilia, Malihe Alikhani | - | 49 15 0 | 10 Jul 2023
Evaluating Biased Attitude Associations of Language Models in an Intersectional Context | Shiva Omrani Sabbaghi, Robert Wolfe, Aylin Caliskan | - | 26 22 0 | 07 Jul 2023
Mass-Producing Failures of Multimodal Systems with Language Models | Shengbang Tong, Erik Jones, Jacob Steinhardt | - | 46 33 0 | 21 Jun 2023
A Bayesian approach to uncertainty in word embedding bias estimation | Alicja Dobrzeniecka, R. Urbaniak | - | 22 1 0 | 15 Jun 2023
Sociodemographic Bias in Language Models: A Survey and Forward Path | Vipul Gupta, Pranav Narayanan Venkit, Shomir Wilson, R. Passonneau | - | 44 21 0 | 13 Jun 2023
Are fairness metric scores enough to assess discrimination biases in machine learning? | Fanny Jourdan, Laurent Risser, Jean-Michel Loubes, Nicholas M. Asher | FaML | 16 5 0 | 08 Jun 2023
Stubborn Lexical Bias in Data and Models | Sofia Serrano, Jesse Dodge, Noah A. Smith | - | 37 2 0 | 03 Jun 2023
Pointwise Representational Similarity | Camila Kolling, Till Speicher, Vedant Nanda, Mariya Toneva, Krishna P. Gummadi | - | 23 1 0 | 30 May 2023
What about em? How Commercial Machine Translation Fails to Handle (Neo-)Pronouns | Anne Lauscher, Debora Nozza, Archie Crowley, E. Miltersen, Dirk Hovy | - | 26 21 0 | 25 May 2023
Psychological Metrics for Dialog System Evaluation | Salvatore Giorgi, Shreya Havaldar, Farhan S. Ahmed, Zuhaib Akhtar, Shalaka Vaidya, Gary Pan, Pallavi V. Kulkarni, H. Andrew Schwartz, Joao Sedoc | - | 22 2 0 | 24 May 2023
Detecting and Mitigating Indirect Stereotypes in Word Embeddings | Erin E. George, Joyce A. Chew, Deanna Needell | - | 24 0 0 | 23 May 2023
Out-of-Distribution Generalization in Text Classification: Past, Present, and Future | Linyi Yang, Yangqiu Song, Xuan Ren, Chenyang Lyu, Yidong Wang, Lingqiao Liu, Jindong Wang, Jennifer Foster, Yue Zhang | OOD | 37 2 0 | 23 May 2023
Target-Agnostic Gender-Aware Contrastive Learning for Mitigating Bias in Multilingual Machine Translation | Minwoo Lee, Hyukhun Koh, Kang-il Lee, Dongdong Zhang, Minsu Kim, Kyomin Jung | - | 35 9 0 | 23 May 2023
On Bias and Fairness in NLP: Investigating the Impact of Bias and Debiasing in Language Models on the Fairness of Toxicity Detection | Fatma Elsafoury, Stamos Katsigiannis | - | 32 1 0 | 22 May 2023
Shielded Representations: Protecting Sensitive Attributes Through Iterative Gradient-Based Projection | Shadi Iskander, Kira Radinsky, Yonatan Belinkov | - | 38 17 0 | 17 May 2023
On the Origins of Bias in NLP through the Lens of the Jim Code | Fatma Elsafoury, Gavin Abercrombie | - | 47 4 0 | 16 May 2023
Surfacing Biases in Large Language Models using Contrastive Input Decoding | G. Yona, Or Honovich, Itay Laish, Roee Aharoni | - | 27 11 0 | 12 May 2023
iLab at SemEval-2023 Task 11 Le-Wi-Di: Modelling Disagreement or Modelling Perspectives? | Nikolas Vitsakis, Amit Parekh, Tanvi Dinkar, Gavin Abercrombie, Ioannis Konstas, Verena Rieser | - | 62 10 0 | 10 May 2023
SkillQG: Learning to Generate Question for Reading Comprehension Assessment | Xiaoqiang Wang, Bang Liu, Siliang Tang, Lingfei Wu | - | 25 3 0 | 08 May 2023
On the Independence of Association Bias and Empirical Fairness in Language Models | Laura Cabello, Anna Katrine van Zee, Anders Søgaard | - | 26 26 0 | 20 Apr 2023
How optimal transport can tackle gender biases in multi-class neural-network classifiers for job recommendations? | Fanny Jourdan, Titon Tshiongo Kaninku, Nicholas M. Asher, Jean-Michel Loubes, Laurent Risser | FaML | 26 4 0 | 27 Feb 2023
Counter-GAP: Counterfactual Bias Evaluation through Gendered Ambiguous Pronouns | Zhongbin Xie, Vid Kocijan, Thomas Lukasiewicz, Oana-Maria Camburu | - | 10 2 0 | 11 Feb 2023
Concept Algebra for (Score-Based) Text-Controlled Generative Models | Zihao Wang, Lin Gui, Jeffrey Negrea, Victor Veitch | CoGe DiffM | 32 32 0 | 07 Feb 2023
Erasure of Unaligned Attributes from Neural Representations | Shun Shao, Yftah Ziser, Shay B. Cohen | - | 14 9 0 | 06 Feb 2023