Detect and Perturb: Neutral Rewriting of Biased and Sensitive Text via Gradient-based Decoding
Zexue He, Bodhisattwa Prasad Majumder, Julian McAuley
arXiv:2109.11708, 24 September 2021
Papers citing "Detect and Perturb: Neutral Rewriting of Biased and Sensitive Text via Gradient-based Decoding" (21 papers shown)
LFTF: Locating First and Then Fine-Tuning for Mitigating Gender Bias in Large Language Models. Zhanyue Qin, Yue Ding, Deyuan Liu, Qingbin Liu, Junxian Cai, Xi Chen, Zhiying Tu, Dianhui Chu, Cuiyun Gao, Dianbo Sui. 21 May 2025.
DeCAP: Context-Adaptive Prompt Generation for Debiasing Zero-shot Question Answering in Large Language Models. Suyoung Bae, YunSeok Choi, Jee-Hyong Lee. 25 Mar 2025.
Mitigating Gender Bias in Code Large Language Models via Model Editing. Zhanyue Qin, Haochuan Wang, Zecheng Wang, Deyuan Liu, Cunhang Fan, Zhao Lv, Zhiying Tu, Dianhui Chu, Dianbo Sui. 10 Oct 2024. [KELM]
Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas. Chengyuan Deng, Yiqun Duan, Xin Jin, Heng Chang, Yijun Tian, ..., Kuofeng Gao, Sihong He, Jun Zhuang, Lu Cheng, Haohan Wang. 08 Jun 2024. [AILaw]
Hire Me or Not? Examining Language Model's Behavior with Occupation Attributes. Damin Zhang, Yi Zhang, Geetanjali Bihani, Julia Taylor Rayz. 06 May 2024.
Potential and Challenges of Model Editing for Social Debiasing. Jianhao Yan, Futing Wang, Yafu Li, Yue Zhang. 21 Feb 2024. [KELM]
A Note on Bias to Complete. Jia Xu, Mona Diab. 18 Feb 2024.
Analyzing Sentiment Polarity Reduction in News Presentation through Contextual Perturbation and Large Language Models. Alapan Kuila, Somnath Jena, Sudeshna Sarkar, P. Chakrabarti. 03 Feb 2024. [AAML]
The Media Bias Taxonomy: A Systematic Literature Review on the Forms and Automated Detection of Media Bias. Timo Spinde, Smilla Hinterreiter, Fabian Haak, Terry Ruas, Helge Giese, Norman Meuschke, Bela Gipp. 26 Dec 2023.
Tackling Bias in Pre-trained Language Models: Current Trends and Under-represented Societies. Vithya Yogarajan, Gillian Dobbie, Te Taka Keegan, R. Neuwirth. 03 Dec 2023. [ALM]
MedEval: A Multi-Level, Multi-Task, and Multi-Domain Medical Benchmark for Language Model Evaluation. Zexue He, Yu Wang, An Yan, Yao Liu, Eric Y. Chang, Amilcare Gentili, Julian McAuley, Chun-Nan Hsu. 21 Oct 2023. [ELM]
Bias and Fairness in Large Language Models: A Survey. Isabel O. Gallegos, Ryan A. Rossi, Joe Barrow, Md Mehrab Tanjim, Sungchul Kim, Franck Dernoncourt, Tong Yu, Ruiyi Zhang, Nesreen Ahmed. 02 Sep 2023. [AILaw]
Targeted Data Generation: Finding and Fixing Model Weaknesses. Zexue He, Marco Tulio Ribeiro, Fereshte Khani. 28 May 2023.
"Nothing Abnormal": Disambiguating Medical Reports via Contrastive Knowledge Infusion
Zexue He
An Yan
Amilcare Gentili
Julian McAuley
Chun-Nan Hsu
MedIm
35
2
0
15 May 2023
Synthetic Pre-Training Tasks for Neural Machine Translation. Zexue He, Graeme W. Blackwood, Yikang Shen, Julian McAuley, Rogerio Feris. 19 Dec 2022.
Style Transfer as Data Augmentation: A Case Study on Named Entity Recognition. Shuguang Chen, Leonardo Neves, Thamar Solorio. 14 Oct 2022.
Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey. Sachin Kumar, Vidhisha Balachandran, Lucille Njoo, Antonios Anastasopoulos, Yulia Tsvetkov. 14 Oct 2022. [ELM]
Controlling Bias Exposure for Fair Interpretable Predictions. Zexue He, Yu Wang, Julian McAuley, Bodhisattwa Prasad Majumder. 14 Oct 2022.
InterFair: Debiasing with Natural Language Feedback for Fair Interpretable Predictions. Bodhisattwa Prasad Majumder, Zexue He, Julian McAuley. 14 Oct 2022.
Text Style Transfer for Bias Mitigation using Masked Language Modeling. E. Tokpo, T. Calders. 21 Jan 2022.
Debiasing Pre-trained Contextualised Embeddings. Masahiro Kaneko, Danushka Bollegala. 23 Jan 2021.