Debiasing isn't enough! -- On the Effectiveness of Debiasing MLMs and their Social Biases in Downstream Tasks
International Conference on Computational Linguistics (COLING), 2022
6 October 2022
Masahiro Kaneko
Danushka Bollegala
Naoaki Okazaki
arXiv:2210.02938
Papers citing "Debiasing isn't enough! -- On the Effectiveness of Debiasing MLMs and their Social Biases in Downstream Tasks"
43 / 43 papers shown
Online Learning Defense against Iterative Jailbreak Attacks via Prompt Optimization
Masahiro Kaneko
Zeerak Talat
Timothy Baldwin
AAML
61
1
0
19 Oct 2025
Once Is Enough: Lightweight DiT-Based Video Virtual Try-On via One-Time Garment Appearance Injection
Yanjie Pan
Qingdong He
Lidong Wang
Bo Peng
Mingmin Chi
DiffM
VGen
43
0
0
09 Oct 2025
Bias after Prompting: Persistent Discrimination in Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025
N. Sivakumar
Natalie Mackraz
Samira Khorshidi
Krishna Patel
B. Theobald
Luca Zappella
N. Apostoloff
AI4CE
48
1
0
09 Sep 2025
Do Biased Models Have Biased Thoughts?
Swati Rajwal
Shivank Garg
Reem Abdel-Salam
Abdelrahman Zayed
LRM
120
0
0
08 Aug 2025
Exploring Gender Bias in Large Language Models: An In-depth Dive into the German Language
Kristin Gnadt
David Thulke
Simone Kopeinik
Ralf Schlüter
121
0
0
22 Jul 2025
Safety Alignment via Constrained Knowledge Unlearning
Zesheng Shi
Yucheng Zhou
Jing Li
MU
KELM
AAML
172
4
0
24 May 2025
Evaluating the Effect of Retrieval Augmentation on Social Biases
Tianhui Zhang
Yi Zhou
Danushka Bollegala
185
1
0
24 Feb 2025
Smaller Large Language Models Can Do Moral Self-Correction
Guangliang Liu
Zhiyu Xue
Rongrong Wang
Kristen Marie Johnson
LRM
253
2
0
30 Oct 2024
BiasAlert: A Plug-and-play Tool for Social Bias Detection in LLMs
Zhiting Fan
Ruizhe Chen
Ruiling Xu
Zuozhu Liu
KELM
202
29
0
14 Jul 2024
Social Bias Evaluation for Large Language Models Requires Prompt Variations
Rem Hida
Masahiro Kaneko
Naoaki Okazaki
183
27
0
03 Jul 2024
A Study of Nationality Bias in Names and Perplexity using Off-the-Shelf Affect-related Tweet Classifiers
Valentin Barriere
Sebastian Cifuentes
140
3
0
01 Jul 2024
Why Don't Prompt-Based Fairness Metrics Correlate?
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
A. Zayed
Gonçalo Mordido
Ioana Baldini
Sarath Chandar
ALM
229
7
0
09 Jun 2024
Towards Understanding Task-agnostic Debiasing Through the Lenses of Intrinsic Bias and Forgetfulness
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Guangliang Liu
Milad Afshari
Xitong Zhang
Zhiyu Xue
Avrajit Ghosh
Bidhan Bashyal
Rongrong Wang
K. Johnson
125
2
0
06 Jun 2024
Anna Karenina Strikes Again: Pre-Trained LLM Embeddings May Favor High-Performing Learners
Abigail Gurin Schleifer
Beata Beigman Klebanov
Moriah Ariely
Giora Alexandron
131
5
0
06 Jun 2024
On the Intrinsic Self-Correction Capability of LLMs: Uncertainty and Latent Concept
Guangliang Liu
Haitao Mao
Bochuan Cao
Zhiyu Xue
K. Johnson
Shucheng Zhou
Rongrong Wang
LRM
160
16
0
04 Jun 2024
Exploring Subjectivity for more Human-Centric Assessment of Social Biases in Large Language Models
Paula Akemi Aoyagui
Sharon Ferguson
Anastasia Kuzminykh
184
1
0
17 May 2024
Twists, Humps, and Pebbles: Multilingual Speech Recognition Models Exhibit Gender Performance Gaps
Giuseppe Attanasio
Beatrice Savoldi
Dennis Fucci
Dirk Hovy
153
12
0
28 Feb 2024
Eagle: Ethical Dataset Given from Real Interactions
Masahiro Kaneko
Danushka Bollegala
Timothy Baldwin
151
4
0
22 Feb 2024
Bias in Language Models: Beyond Trick Tests and Toward RUTEd Evaluation
Kristian Lum
Jacy Reese Anthis
Chirag Nagpal
Alexander D'Amour
347
28
0
20 Feb 2024
A Note on Bias to Complete
Jia Xu
Mona Diab
210
2
0
18 Feb 2024
Semantic Properties of cosine based bias scores for word embeddings
International Conference on Pattern Recognition Applications and Methods (ICPRAM), 2024
Sarah Schröder
Alexander Schulz
Fabian Hinder
Barbara Hammer
178
1
0
27 Jan 2024
The Gaps between Pre-train and Downstream Settings in Bias Evaluation and Debiasing
Masahiro Kaneko
Danushka Bollegala
Timothy Baldwin
172
5
0
16 Jan 2024
Understanding the Effect of Model Compression on Social Bias in Large Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Gustavo Gonçalves
Emma Strubell
247
17
0
09 Dec 2023
General Phrase Debiaser: Debiasing Masked Language Models at a Multi-Token Level
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Bingkang Shi
Xiaodan Zhang
Dehan Kong
Yulei Wu
Zongzhen Liu
Honglei Lyu
Longtao Huang
AI4CE
230
3
0
23 Nov 2023
Fair Text Classification with Wasserstein Independence
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Thibaud Leteno
Antoine Gourru
Charlotte Laclau
Rémi Emonet
Christophe Gravier
FaML
175
5
0
21 Nov 2023
Does Pre-trained Language Model Actually Infer Unseen Links in Knowledge Graph Completion?
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Yusuke Sakai
Hidetaka Kamigaito
Katsuhiko Hayashi
Taro Watanabe
187
2
0
15 Nov 2023
Selecting Shots for Demographic Fairness in Few-Shot Learning with Large Language Models
Carlos Alejandro Aguirre
Kuleen Sasse
Isabel Cachola
Mark Dredze
245
2
0
14 Nov 2023
Evaluating Bias and Fairness in Gender-Neutral Pretrained Vision-and-Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Laura Cabello
Emanuele Bugliarello
Stephanie Brandl
Desmond Elliott
159
8
0
26 Oct 2023
A Predictive Factor Analysis of Social Biases and Task-Performance in Pretrained Masked Language Models
Yi Zhou
Jose Camacho-Collados
Danushka Bollegala
345
6
0
19 Oct 2023
Co²PT: Mitigating Bias in Pre-trained Language Models through Counterfactual Contrastive Prompt Tuning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Xiangjue Dong
Ziwei Zhu
Zhuoer Wang
Maria Teleki
James Caverlee
212
15
0
19 Oct 2023
Evaluating Gender Bias of Pre-trained Language Models in Natural Language Inference by Considering All Labels
International Conference on Language Resources and Evaluation (LREC), 2023
Panatchakorn Anantaprayoon
Masahiro Kaneko
Naoaki Okazaki
304
21
0
18 Sep 2023
In-Contextual Gender Bias Suppression for Large Language Models
Findings, 2023
Daisuke Oba
Masahiro Kaneko
Danushka Bollegala
186
12
0
13 Sep 2023
Bias and Fairness in Large Language Models: A Survey
Computational Linguistics (CL), 2023
Isabel O. Gallegos
Ryan Rossi
Joe Barrow
Md Mehrab Tanjim
Sungchul Kim
Franck Dernoncourt
Tong Yu
Ruiyi Zhang
Nesreen Ahmed
AILaw
278
849
0
02 Sep 2023
Thesis Distillation: Investigating The Impact of Bias in NLP Models on Hate Speech Detection
Fatma Elsafoury
157
4
0
31 Aug 2023
On Bias and Fairness in NLP: Investigating the Impact of Bias and Debiasing in Language Models on the Fairness of Toxicity Detection
Fatma Elsafoury
Stamos Katsigiannis
153
1
0
22 May 2023
Word Embeddings Are Steers for Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Chi Han
Jialiang Xu
Pengfei Yu
Yi R. Fung
Chenkai Sun
Nan Jiang
Tarek Abdelzaher
Heng Ji
LLMSV
241
61
0
22 May 2023
Solving NLP Problems through Human-System Collaboration: A Discussion-based Approach
Findings, 2023
Masahiro Kaneko
Graham Neubig
Naoaki Okazaki
242
6
0
19 May 2023
On the Origins of Bias in NLP through the Lens of the Jim Code
Fatma Elsafoury
Gavin Abercrombie
174
5
0
16 May 2023
On the Independence of Association Bias and Empirical Fairness in Language Models
Conference on Fairness, Accountability and Transparency (FAccT), 2023
Laura Cabello
Anna Katrine van Zee
Anders Søgaard
131
35
0
20 Apr 2023
Comparing Intrinsic Gender Bias Evaluation Measures without using Human Annotated Examples
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Masahiro Kaneko
Danushka Bollegala
Naoaki Okazaki
122
12
0
28 Jan 2023
Dissociating language and thought in large language models
Kyle Mahowald
Anna A. Ivanova
I. Blank
Nancy Kanwisher
J. Tenenbaum
Evelina Fedorenko
ELM
ReLM
228
228
0
16 Jan 2023
Gender Biases Unexpectedly Fluctuate in the Pre-training Stage of Masked Language Models
Kenan Tang
Hanchun Jiang
AI4CE
145
1
0
26 Nov 2022
MABEL: Attenuating Gender Bias using Textual Entailment Data
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Jacqueline He
Mengzhou Xia
C. Fellbaum
Danqi Chen
164
37
0
26 Oct 2022