ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Measuring Implicit Bias in Explicitly Unbiased Large Language Models

Xuechunzi Bai, Angelina Wang, Ilia Sucholutsky, Thomas Griffiths
6 February 2024 (arXiv:2402.04105)

Papers citing "Measuring Implicit Bias in Explicitly Unbiased Large Language Models"

13 papers shown.

  1. Trust Me, I Can Handle It: Self-Generated Adversarial Scenario Extrapolation for Robust Language Models
     Md Rafi Ur Rashid, Vishnu Asutosh Dasu, Ye Wang, Gang Tan, Shagufta Mehnaz
     20 May 2025 (AAML, ELM)
  2. Gender-Neutral Large Language Models for Medical Applications: Reducing Bias in PubMed Abstracts
     Elizabeth Schaefer, Kirk Roberts
     10 Jan 2025
  3. Raising the Bar: Investigating the Values of Large Language Models via Generative Evolving Testing
     Han Jiang, Xiaoyuan Yi, Zhihua Wei, Ziang Xiao, Shu Wang, Xing Xie
     20 Jun 2024 (ELM, ALM)
  4. "Kelly is a Warm Person, Joseph is a Role Model": Gender Biases in LLM-Generated Reference Letters
     Yixin Wan, George Pu, Jiao Sun, Aparna Garimella, Kai-Wei Chang, Nanyun Peng
     13 Oct 2023
  5. DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
     Wei Ping, Weixin Chen, Hengzhi Pei, Chulin Xie, Mintong Kang, ..., Zinan Lin, Yuk-Kit Cheng, Sanmi Koyejo, D. Song, Yue Liu
     20 Jun 2023
  6. "I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation
     Anaelia Ovalle, Palash Goyal, Jwala Dhamala, Zachary Jaggers, Kai-Wei Chang, Aram Galstyan, R. Zemel, Rahul Gupta
     17 May 2023
  7. Using cognitive psychology to understand GPT-3
     Marcel Binz, Eric Schulz
     21 Jun 2022 (ELM, LLMAG)
  8. Training language models to follow instructions with human feedback
     Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
     04 Mar 2022 (OSLM, ALM)
  9. Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets
     Irene Solaiman, Christy Dennison
     18 Jun 2021
  10. Persistent Anti-Muslim Bias in Large Language Models
      Abubakar Abid, Maheen Farooqi, James Zou
      14 Jan 2021 (AILaw)
  11. Language (Technology) is Power: A Critical Survey of "Bias" in NLP
      Su Lin Blodgett, Solon Barocas, Hal Daumé, Hanna M. Wallach
      28 May 2020
  12. The Woman Worked as a Babysitter: On Biases in Language Generation
      Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, Nanyun Peng
      03 Sep 2019
  13. Word Embeddings Quantify 100 Years of Gender and Ethnic Stereotypes
      Nikhil Garg, L. Schiebinger, Dan Jurafsky, James Zou
      22 Nov 2017 (AI4TS)