A database to support the evaluation of gender biases in GPT-4o output

28 February 2025
Luise Mehner
Lena Alicija Philine Fiedler
Sabine Ammon
Dorothea Kolossa
Abstract

The widespread application of Large Language Models (LLMs) involves ethical risks for users and societies. A prominent ethical risk of LLMs is the generation of unfair language output that reinforces or exacerbates harm to members of disadvantaged social groups through gender biases (Weidinger et al., 2022; Bender et al., 2021; Kotek et al., 2023). Hence, the evaluation of the fairness of LLM outputs with respect to such biases is a topic of rising interest. To advance research in this field, promote discourse on suitable normative bases and evaluation methodologies, and enhance the reproducibility of related studies, we propose a novel approach to database construction. This approach enables the assessment of gender-related biases in LLM-generated language beyond merely evaluating their degree of neutralization.

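The paper itself specifies the database design; the snippet below is only an illustrative sketch of the general idea of pairing GPT-4o outputs with gender-bias annotations in a queryable table, not the authors' schema or annotation scheme. All table names, columns, labels, and example rows are hypothetical.

```python
import sqlite3

# Hypothetical schema: each row pairs a prompt with a model response and a
# human-assigned gender-bias label. Names are illustrative, not from the paper.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE llm_outputs (
        id INTEGER PRIMARY KEY,
        prompt TEXT NOT NULL,
        model TEXT NOT NULL,
        response TEXT NOT NULL,
        gender_bias_label TEXT
            CHECK (gender_bias_label IN ('neutral', 'biased', 'ambiguous'))
    )
    """
)

# Placeholder rows; in practice responses would be collected from GPT-4o
# and annotated by raters according to the chosen normative criteria.
rows = [
    ("Describe a typical nurse.", "gpt-4o", "A nurse cares for patients ...", "neutral"),
    ("Describe a typical engineer.", "gpt-4o", "He designs systems ...", "biased"),
]
conn.executemany(
    "INSERT INTO llm_outputs (prompt, model, response, gender_bias_label) VALUES (?, ?, ?, ?)",
    rows,
)

# Simple aggregate: share of responses labelled as biased, per model.
for model, n_biased, n_total in conn.execute(
    """
    SELECT model,
           SUM(gender_bias_label = 'biased') AS n_biased,
           COUNT(*) AS n_total
    FROM llm_outputs
    GROUP BY model
    """
):
    print(f"{model}: {n_biased}/{n_total} responses labelled biased")
```

Such a structure would allow analyses that go beyond a single neutrality score, e.g. breaking down bias labels by prompt category or comparing models, which is the kind of finer-grained assessment the abstract alludes to.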
@article{mehner2025_2502.20898,
  title={A database to support the evaluation of gender biases in GPT-4o output},
  author={Luise Mehner and Lena Alicija Philine Fiedler and Sabine Ammon and Dorothea Kolossa},
  journal={arXiv preprint arXiv:2502.20898},
  year={2025}
}