ResearchTrend.AI
A Trip Towards Fairness: Bias and De-Biasing in Large Language Models

23 May 2023
Leonardo Ranaldi
Elena Sofia Ruzzetti
Davide Venditti
Dario Onorati
Fabio Massimo Zanzotto

Papers citing "A Trip Towards Fairness: Bias and De-Biasing in Large Language Models"

28 / 28 papers shown
Sensing and Steering Stereotypes: Extracting and Applying Gender Representation Vectors in LLMs
Hannah Cyberey
Yangfeng Ji
David E. Evans
LLMSV
72
1
0
27 Feb 2025
Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation
Vera Neplenbroek
Arianna Bisazza
Raquel Fernández
103
0
0
17 Feb 2025
Understanding and Mitigating Gender Bias in LLMs via Interpretable Neuron Editing
Zeping Yu
Sophia Ananiadou
KELM
43
1
0
24 Jan 2025
Evaluating Gender Bias in Large Language Models
Michael Döll
Markus Döhring
Andreas Müller
30
1
0
14 Nov 2024
Conformity in Large Language Models
Xiaochen Zhu
Caiqi Zhang
Tom Stafford
Nigel Collier
Andreas Vlachos
46
0
0
16 Oct 2024
Investigating Implicit Bias in Large Language Models: A Large-Scale Study of Over 50 LLMs
Divyanshu Kumar
Umang Jain
Sahil Agarwal
P. Harshangi
37
4
0
13 Oct 2024
The Lou Dataset -- Exploring the Impact of Gender-Fair Language in German Text Classification
Andreas Waldis
Joel Birrer
Anne Lauscher
Iryna Gurevych
25
1
0
26 Sep 2024
'Since Lawyers are Males..': Examining Implicit Gender Bias in Hindi Language Generation by LLMs
Ishika Joshi
Ishita Gupta
Adrita Dey
Tapan Parikh
AI4CE
30
2
0
20 Sep 2024
Social Bias in Large Language Models For Bangla: An Empirical Study on Gender and Religious Bias
Jayanta Sadhu
Maneesha Rani Saha
Rifat Shahriyar
37
3
0
03 Jul 2024
Interpreting Bias in Large Language Models: A Feature-Based Approach
Nirmalendu Prakash
Lee Ka Wei Roy
37
1
0
18 Jun 2024
Safeguarding Large Language Models: A Survey
Yi Dong
Ronghui Mu
Yanghao Zhang
Siqi Sun
Tianle Zhang
...
Yi Qi
Jinwei Hu
Jie Meng
Saddek Bensalem
Xiaowei Huang
OffRL
KELM
AILaw
35
17
0
03 Jun 2024
LIDAO: Towards Limited Interventions for Debiasing (Large) Language Models
Tianci Liu
Haoyu Wang
Shiyang Wang
Yu Cheng
Jing Gao
ALM
35
0
0
01 Jun 2024
Spectral Editing of Activations for Large Language Model Alignment
Yifu Qiu
Zheng Zhao
Yftah Ziser
Anna Korhonen
E. Ponti
Shay B. Cohen
KELM
LLMSV
28
15
0
15 May 2024
A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias
Yuemei Xu
Ling Hu
Jiayi Zhao
Zihan Qiu
Yuqi Ye
Hanwen Gu
LRM
27
36
0
01 Apr 2024
Reducing Large Language Model Bias with Emphasis on 'Restricted Industries': Automated Dataset Augmentation and Prejudice Quantification
Devam Mondal
Carlo Lipizzi
16
0
0
20 Mar 2024
Potential and Challenges of Model Editing for Social Debiasing
Jianhao Yan
Futing Wang
Yafu Li
Yue Zhang
KELM
62
9
0
21 Feb 2024
Building Guardrails for Large Language Models
Yizhen Dong
Ronghui Mu
Gao Jin
Yi Qi
Jinwei Hu
Xingyu Zhao
Jie Meng
Wenjie Ruan
Xiaowei Huang
OffRL
61
27
0
02 Feb 2024
Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems
Tianyu Cui
Yanling Wang
Chuanpu Fu
Yong Xiao
Sijia Li
...
Junwu Xiong
Xinyu Kong
Zujie Wen
Ke Xu
Qi Li
57
56
0
11 Jan 2024
Tackling Bias in Pre-trained Language Models: Current Trends and Under-represented Societies
Vithya Yogarajan
Gillian Dobbie
Te Taka Keegan
R. Neuwirth
ALM
43
11
0
03 Dec 2023
Debiasing Algorithm through Model Adaptation
Tomasz Limisiewicz
David Mareček
Tomáš Musil
21
12
0
29 Oct 2023
Zero-shot Faithfulness Evaluation for Text Summarization with Foundation Language Model
Qi Jia
Siyu Ren
Yizhu Liu
Kenny Q. Zhu
ALM
HILM
33
16
0
18 Oct 2023
Bias and Fairness in Large Language Models: A Survey
Isabel O. Gallegos
Ryan A. Rossi
Joe Barrow
Md Mehrab Tanjim
Sungchul Kim
Franck Dernoncourt
Tong Yu
Ruiyi Zhang
Nesreen Ahmed
AILaw
19
486
0
02 Sep 2023
Large Language Models and Knowledge Graphs: Opportunities and Challenges
Jeff Z. Pan
Simon Razniewski
Jan-Christoph Kalo
Sneha Singhania
Jiaoyan Chen
...
Gerard de Melo
A. Bonifati
Edlira Vakaj
M. Dragoni
D. Graux
KELM
30
73
0
11 Aug 2023
Exploring Linguistic Properties of Monolingual BERTs with Typological Classification among Languages
Elena Sofia Ruzzetti
Federico Ranaldi
F. Logozzo
Michele Mastromattei
Leonardo Ranaldi
Fabio Massimo Zanzotto
19
8
0
03 May 2023
Debiasing Pre-trained Contextualised Embeddings
Masahiro Kaneko
Danushka Bollegala
215
138
0
23 Jan 2021
The Woman Worked as a Babysitter: On Biases in Language Generation
Emily Sheng
Kai-Wei Chang
Premkumar Natarajan
Nanyun Peng
214
616
0
03 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
242
31,257
0
16 Jan 2013