1607.06520
Cited By
Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings
21 July 2016
Tolga Bolukbasi
Kai-Wei Chang
James Zou
Venkatesh Saligrama
Adam Kalai
CVBM
FaML
Papers citing "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings"
50 / 52 papers shown
Title
Fairness Practices in Industry: A Case Study in Machine Learning Teams Building Recommender Systems
Jing Nathan Yan
Junxiong Wang
Jeffrey M. Rzeszotarski
Allison Koenecke
FaML
41
0
0
26 May 2025
Trust Me, I Can Handle It: Self-Generated Adversarial Scenario Extrapolation for Robust Language Models
Md Rafi Ur Rashid
Vishnu Asutosh Dasu
Ye Wang
Gang Tan
Shagufta Mehnaz
AAML
ELM
54
0
0
20 May 2025
Do Large Language Models know who did what to whom?
Joseph M. Denning
Xiaohan
Bryor Snefjella
Idan A. Blank
123
1
0
23 Apr 2025
Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations
Ziwei Ji
L. Yu
Yeskendir Koishekenov
Yejin Bang
Anthony Hartshorn
Alan Schelten
Cheng Zhang
Pascale Fung
Nicola Cancedda
65
4
0
18 Mar 2025
Gender Encoding Patterns in Pretrained Language Model Representations
Mahdi Zakizadeh
Mohammad Taher Pilehvar
127
0
0
09 Mar 2025
Investigating the Relationship Between Debiasing and Artifact Removal using Saliency Maps
Lukasz Sztukiewicz
Ignacy Stepka
Michał Wiliński
Jerzy Stefanowski
98
0
0
28 Feb 2025
Is Free Self-Alignment Possible?
Dyah Adila
Changho Shin
Yijing Zhang
Frederic Sala
MoMe
139
2
0
24 Feb 2025
The Call for Socially Aware Language Technologies
Diyi Yang
Dirk Hovy
David Jurgens
Barbara Plank
VLM
95
11
0
24 Feb 2025
Man Made Language Models? Evaluating LLMs' Perpetuation of Masculine Generics Bias
Enzo Doyen
Amalia Todirascu
62
0
0
14 Feb 2025
Fine-Tuned LLMs are "Time Capsules" for Tracking Societal Bias Through Books
Sangmitra Madhusudan
Robert D Morabito
Skye Reid
Nikta Gohari Sadr
Ali Emami
89
1
0
07 Feb 2025
Large language models can replicate cross-cultural differences in personality
Paweł Niszczota
Mateusz Janczak
Michał Misiak
64
6
0
28 Jan 2025
The Goofus & Gallant Story Corpus for Practical Value Alignment
Md Sultan al Nahian
Tasmia Tasrin
Spencer Frazier
Mark O. Riedl
Brent Harrison
71
0
0
17 Jan 2025
Gender-Neutral Large Language Models for Medical Applications: Reducing Bias in PubMed Abstracts
Elizabeth Schaefer
Kirk Roberts
115
0
0
10 Jan 2025
Generalizing Trust: Weak-to-Strong Trustworthiness in Language Models
Martin Pawelczyk
Lillian Sun
Zhenting Qi
Aounon Kumar
Himabindu Lakkaraju
84
1
0
03 Jan 2025
Social Science Is Necessary for Operationalizing Socially Responsible Foundation Models
Adam Davies
Elisa Nguyen
Michael Simeone
Erik Johnston
Martin Gubri
142
0
0
20 Dec 2024
Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation
Vera Neplenbroek
Arianna Bisazza
Raquel Fernández
144
1
0
18 Dec 2024
Perception of Visual Content: Differences Between Humans and Foundation Models
Nardiena A. Pratama
Shaoyang Fan
Gianluca Demartini
VLM
119
0
0
28 Nov 2024
Controllable Context Sensitivity and the Knob Behind It
Julian Minder
Kevin Du
Niklas Stoehr
Giovanni Monea
Chris Wendler
Robert West
Ryan Cotterell
KELM
75
5
0
11 Nov 2024
Natural Language Processing for Human Resources: A Survey
Naoki Otani
Nikita Bhutani
Estevam R. Hruschka
VLM
52
0
0
21 Oct 2024
LLMs are Biased Teachers: Evaluating LLM Bias in Personalized Education
Iain Xie Weissburg
Sathvika Anand
Sharon Levy
Haewon Jeong
127
3
0
17 Oct 2024
Improving Instruction-Following in Language Models through Activation Steering
Alessandro Stolfo
Vidhisha Balachandran
Safoora Yousefi
Eric Horvitz
Besmira Nushi
LLMSV
80
21
0
15 Oct 2024
Organizing Unstructured Image Collections using Natural Language
Mingxuan Liu
Zhun Zhong
Jun Li
Gianni Franchi
Subhankar Roy
Elisa Ricci
VLM
87
3
0
07 Oct 2024
Collapsed Language Models Promote Fairness
Jingxuan Xu
Wuyang Chen
Linyi Li
Yao Zhao
Yunchao Wei
57
0
0
06 Oct 2024
Attention layers provably solve single-location regression
Pierre Marion
Raphael Berthier
Gérard Biau
Claire Boyer
332
4
0
02 Oct 2024
Mitigating Propensity Bias of Large Language Models for Recommender Systems
Guixian Zhang
Guan Yuan
Debo Cheng
Lin Liu
Jiuyong Li
Shichao Zhang
82
4
0
30 Sep 2024
Identity-related Speech Suppression in Generative AI Content Moderation
Oghenefejiro Isaacs Anigboro
Charlie M. Crawford
Danaë Metaxa
Sorelle A. Friedler
59
0
0
09 Sep 2024
Counterfactual Fairness by Combining Factual and Counterfactual Predictions
Zeyu Zhou
Tianci Liu
Ruqi Bai
Jing Gao
Murat Kocaoglu
David I. Inouye
73
2
0
03 Sep 2024
Multi-Output Distributional Fairness via Post-Processing
Gang Li
Qihang Lin
Ayush Ghosh
Tianbao Yang
106
0
0
31 Aug 2024
GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models
Kunsheng Tang
Wenbo Zhou
Jie Zhang
Aishan Liu
Gelei Deng
Shuai Li
Peigui Qi
Weiming Zhang
Tianwei Zhang
Nenghai Yu
71
3
0
22 Aug 2024
Does Liking Yellow Imply Driving a School Bus? Semantic Leakage in Language Models
Hila Gonen
Terra Blevins
Alisa Liu
Luke Zettlemoyer
Noah A. Smith
67
5
0
12 Aug 2024
CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models
Song Wang
Peng Wang
Tong Zhou
Yushun Dong
Zhen Tan
Jundong Li
CoGe
85
8
0
02 Jul 2024
Exploring Safety-Utility Trade-Offs in Personalized Language Models
Anvesh Rao Vijjini
Somnath Basu Roy Chowdhury
Snigdha Chaturvedi
92
8
0
17 Jun 2024
Hire Me or Not? Examining Language Model's Behavior with Occupation Attributes
Damin Zhang
Yi Zhang
Geetanjali Bihani
Julia Taylor Rayz
78
2
0
06 May 2024
SafetyPrompts: a Systematic Review of Open Datasets for Evaluating and Improving Large Language Model Safety
Paul Röttger
Fabio Pernisi
Bertie Vidgen
Dirk Hovy
ELM
KELM
87
33
0
08 Apr 2024
Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality
Rahul Zalkikar
Kanchan Chandra
69
1
0
21 Feb 2024
Bias in Language Models: Beyond Trick Tests and Toward RUTEd Evaluation
Kristian Lum
Jacy Reese Anthis
Chirag Nagpal
Alexander D’Amour
61
16
0
20 Feb 2024
(Ir)rationality in AI: State of the Art, Research Challenges and Open Questions
Olivia Macmillan-Scott
Mirco Musolesi
56
1
0
28 Nov 2023
Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender Perturbation over Fairytale Texts
Christina Chance
Da Yin
Dakuo Wang
Kai-Wei Chang
40
0
0
16 Oct 2023
A Geometric Notion of Causal Probing
Clément Guerner
Anej Svete
Tianyu Liu
Alex Warstadt
Ryan Cotterell
LLMSV
58
12
0
27 Jul 2023
LEACE: Perfect linear concept erasure in closed form
Nora Belrose
David Schneider-Joseph
Shauli Ravfogel
Ryan Cotterell
Edward Raff
Stella Biderman
KELM
MU
58
107
0
06 Jun 2023
ADEPT: A DEbiasing PrompT Framework
Ke Yang
Charles Yu
Yi R. Fung
Manling Li
Heng Ji
42
24
0
10 Nov 2022
The Language Interpretability Tool: Extensible, Interactive Visualizations and Analysis for NLP Models
Ian Tenney
James Wexler
Jasmijn Bastings
Tolga Bolukbasi
Andy Coenen
...
Ellen Jiang
Mahima Pushkarna
Carey Radebaugh
Emily Reif
Ann Yuan
VLM
94
192
0
12 Aug 2020
Mitigating Gender Bias in Captioning Systems
Ruixiang Tang
Mengnan Du
Yuening Li
Zirui Liu
Na Zou
Xia Hu
FaML
34
66
0
15 Jun 2020
What is Fair? Exploring Pareto-Efficiency for Fairness Constrained Classifiers
Ananth Balashankar
Alyssa Lees
Chris Welty
L. Subramanian
33
21
0
30 Oct 2019
Semi-supervised Question Retrieval with Gated Convolutions
Tao Lei
Hrishikesh Joshi
Regina Barzilay
Tommi Jaakkola
K. Tymoshenko
Alessandro Moschitti
Lluís Màrquez i Villodre
RALM
48
106
0
17 Dec 2015
Certifying and removing disparate impact
Michael Feldman
Sorelle A. Friedler
John Moeller
C. Scheidegger
Suresh Venkatasubramanian
FaML
112
1,978
0
11 Dec 2014
Automated Experiments on Ad Privacy Settings: A Tale of Opacity, Choice, and Discrimination
Amit Datta
Michael Carl Tschantz
Anupam Datta
45
731
0
27 Aug 2014
Learning Word Representations with Hierarchical Sparse Coding
Dani Yogatama
Manaal Faruqui
Chris Dyer
Noah A. Smith
78
61
0
08 Jun 2014
Distributed Representations of Words and Phrases and their Compositionality
Tomas Mikolov
Ilya Sutskever
Kai Chen
G. Corrado
J. Dean
NAI
OCL
275
33,445
0
16 Oct 2013
Domain and Function: A Dual-Space Model of Semantic Relations and Compositions
Peter D. Turney
CoGe
57
194
0
16 Sep 2013