StereoSet: Measuring stereotypical bias in pretrained language models
arXiv 2004.09456
20 April 2020
Moin Nadeem, Anna Bethke, Siva Reddy

Papers citing "StereoSet: Measuring stereotypical bias in pretrained language models"
50 / 158 papers shown

A Comprehensive Analysis of Large Language Model Outputs: Similarity, Diversity, and Bias
Brandon Smith, Mohamed Reda Bouadjenek, Tahsin Alamgir Kheya, Phillip Dawson, S. Aryal
ALM, ELM
14 May 2025

Bridging AI and Carbon Capture: A Dataset for LLMs in Ionic Liquids and CBE Research
Gaurab Sarkar, Sougata Saha
11 May 2025

Emotions in the Loop: A Survey of Affective Computing for Emotional Support
Karishma Hegde, Hemadri Jayalath
02 May 2025

BiasGuard: A Reasoning-enhanced Bias Detection Tool For Large Language Models
Zhiting Fan, Ruizhe Chen, Zuozhu Liu
30 Apr 2025

Mind the Language Gap: Automated and Augmented Evaluation of Bias in LLMs for High- and Low-Resource Languages
Alessio Buscemi, Cedric Lothritz, Sergio Morales, Marcos Gomez-Vazquez, Robert Clarisó, Jordi Cabot, German Castignani
19 Apr 2025

Gender and content bias in Large Language Models: a case study on Google Gemini 2.0 Flash Experimental
Roberto Balestri
18 Mar 2025

Gender Encoding Patterns in Pretrained Language Model Representations
Mahdi Zakizadeh, Mohammad Taher Pilehvar
09 Mar 2025

Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation
Vera Neplenbroek, Arianna Bisazza, Raquel Fernández
17 Feb 2025

Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs
Angelina Wang, Michelle Phan, Daniel E. Ho, Sanmi Koyejo
04 Feb 2025

Understanding and Mitigating Gender Bias in LLMs via Interpretable Neuron Editing
Zeping Yu, Sophia Ananiadou
KELM
24 Jan 2025

Owls are wise and foxes are unfaithful: Uncovering animal stereotypes in vision-language models
Tabinda Aman, Mohammad Nadeem, S. Sohail, Mohammad Anas, Erik Cambria
VLM
21 Jan 2025

BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages
Junho Myung, Nayeon Lee, Yi Zhou, Jiho Jin, Rifki Afina Putri, ..., Seid Muhie Yimam, Mohammad Taher Pilehvar, N. Ousidhoum, Jose Camacho-Collados, Alice H. Oh
17 Jan 2025

LLMs are Biased Teachers: Evaluating LLM Bias in Personalized Education
Iain Xie Weissburg, Sathvika Anand, Sharon Levy, Haewon Jeong
17 Oct 2024

Bias Similarity Across Large Language Models
Hyejun Jeong, Shiqing Ma, Amir Houmansadr
15 Oct 2024

No Free Lunch: Retrieval-Augmented Generation Undermines Fairness in LLMs, Even for Vigilant Users
Mengxuan Hu, Hongyi Wu, Zihan Guan, Ronghang Zhu, Dongliang Guo, Daiqing Qi, Sheng Li
SILM
10 Oct 2024

Bridging Today and the Future of Humanity: AI Safety in 2024 and Beyond
Shanshan Han
09 Oct 2024

Collapsed Language Models Promote Fairness
Jingxuan Xu, Wuyang Chen, Linyi Li, Yao Zhao, Yunchao Wei
06 Oct 2024

The Lou Dataset -- Exploring the Impact of Gender-Fair Language in German Text Classification
Andreas Waldis, Joel Birrer, Anne Lauscher, Iryna Gurevych
26 Sep 2024

GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models
Kunsheng Tang, Wenbo Zhou, Jie Zhang, Aishan Liu, Gelei Deng, Shuai Li, Peigui Qi, Weiming Zhang, Tianwei Zhang, Nenghai Yu
22 Aug 2024

Does Liking Yellow Imply Driving a School Bus? Semantic Leakage in Language Models
Hila Gonen, Terra Blevins, Alisa Liu, Luke Zettlemoyer, Noah A. Smith
12 Aug 2024

Bringing AI Participation Down to Scale: A Comment on OpenAI's Democratic Inputs to AI Project
David Moats, Chandrima Ganguly
VLM
16 Jul 2024

Are Large Language Models Really Bias-Free? Jailbreak Prompts for Assessing Adversarial Robustness to Bias Elicitation
Riccardo Cantini, Giada Cosenza, A. Orsino, Domenico Talia
AAML
11 Jul 2024

CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models
Song Wang, Peng Wang, Tong Zhou, Yushun Dong, Zhen Tan, Jundong Li
CoGe
02 Jul 2024

Exploring Safety-Utility Trade-Offs in Personalized Language Models
Anvesh Rao Vijjini, Somnath Basu Roy Chowdhury, Snigdha Chaturvedi
17 Jun 2024

Do Large Language Models Discriminate in Hiring Decisions on the Basis of Race, Ethnicity, and Gender?
Haozhe An, Christabel Acquaye, Colin Wang, Zongxia Li, Rachel Rudinger
15 Jun 2024

Towards Understanding Task-agnostic Debiasing Through the Lenses of Intrinsic Bias and Forgetfulness
Guangliang Liu, Milad Afshari, Xitong Zhang, Zhiyu Xue, Avrajit Ghosh, Bidhan Bashyal, Rongrong Wang, K. Johnson
06 Jun 2024

Ask LLMs Directly, "What shapes your bias?": Measuring Social Bias in Large Language Models
Jisu Shin, Hoyun Song, Huije Lee, Soyeong Jeong, Jong C. Park
06 Jun 2024

Uncovering Bias in Large Vision-Language Models at Scale with Counterfactuals
Phillip Howard, Kathleen C. Fraser, Anahita Bhiwandiwalla, S. Kiritchenko
30 May 2024

GPT is Not an Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction
Virginia K. Felkner, Jennifer A. Thompson, Jonathan May
24 May 2024

MBIAS: Mitigating Bias in Large Language Models While Retaining Context
Shaina Raza, Ananya Raval, Veronica Chatrath
18 May 2024

Quite Good, but Not Enough: Nationality Bias in Large Language Models -- A Case Study of ChatGPT
Shucheng Zhu, Weikang Wang, Ying Liu
11 May 2024

Are Models Biased on Text without Gender-related Language?
Catarina G Belém, P. Seshadri, Yasaman Razeghi, Sameer Singh
01 May 2024

Identifying Fairness Issues in Automatically Generated Testing Content
Kevin Stowe, Benny Longwill, Alyssa Francis, Tatsuya Aoyama, Debanjan Ghosh, Swapna Somasundaran
23 Apr 2024

AdvisorQA: Towards Helpful and Harmless Advice-seeking Question Answering with Collective Intelligence
Minbeom Kim, Hwanhee Lee, Joonsuk Park, Hwaran Lee, Kyomin Jung
18 Apr 2024

REQUAL-LM: Reliability and Equity through Aggregation in Large Language Models
Sana Ebrahimi, N. Shahbazi, Abolfazl Asudeh
17 Apr 2024

GenEARL: A Training-Free Generative Framework for Multimodal Event Argument Role Labeling
Hritik Bansal, Po-Nien Kung, P. Brantingham, Weisheng Wang, Miao Zheng
VLM
07 Apr 2024

Measuring Political Bias in Large Language Models: What Is Said and How It Is Said
Yejin Bang, Delong Chen, Nayeon Lee, Pascale Fung
27 Mar 2024

Revisiting The Classics: A Study on Identifying and Rectifying Gender Stereotypes in Rhymes and Poems
Aditya Narayan Sankaran, Vigneshwaran Shankaran, Sampath Lonka, Rajesh Sharma
18 Mar 2024

Evaluating LLMs for Gender Disparities in Notable Persons
L. Rhue, Sofie Goethals, Arun Sundararajan
14 Mar 2024

Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution
Flor Miriam Plaza del Arco, A. C. Curry, Alba Curry, Gavin Abercrombie, Dirk Hovy
05 Mar 2024

LLM-Assisted Content Conditional Debiasing for Fair Text Embedding
Wenlong Deng, Blair Chen, Beidi Zhao, Chiyu Zhang, Xiaoxiao Li, Christos Thrampoulidis
22 Feb 2024

COBIAS: Assessing the Contextual Reliability of Bias Benchmarks for Language Models
Priyanshul Govil, Hemang Jain, Vamshi Krishna Bonagiri, Aman Chadha, Ponnurangam Kumaraguru, Manas Gaur, Sanorita Dey
22 Feb 2024

Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality
Rahul Zalkikar, Kanchan Chandra
21 Feb 2024

Bias in Language Models: Beyond Trick Tests and Toward RUTEd Evaluation
Kristian Lum, Jacy Reese Anthis, Chirag Nagpal, Alexander D’Amour
20 Feb 2024

ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs
Fengqing Jiang, Zhangchen Xu, Luyao Niu, Zhen Xiang, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran
19 Feb 2024

Adversarial Nibbler: An Open Red-Teaming Method for Identifying Diverse Harms in Text-to-Image Generation
Jessica Quaye, Alicia Parrish, Oana Inel, Charvi Rastogi, Hannah Rose Kirk, ..., Nathan Clement, Rafael Mosquera, Juan Ciro, Vijay Janapa Reddi, Lora Aroyo
14 Feb 2024

Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting
Masahiro Kaneko, Danushka Bollegala, Naoaki Okazaki, Timothy Baldwin
LRM
28 Jan 2024

Quantifying Stereotypes in Language
Yang Liu
28 Jan 2024

Open Models, Closed Minds? On Agents' Capabilities in Mimicking Human Personalities through Open Large Language Models
Lucio La Cava, Andrea Tagarelli
LLMAG, AI4CE
13 Jan 2024

Multilingual large language models leak human stereotypes across language boundaries
Yang Trista Cao, Anna Sotnikova, Jieyu Zhao, Linda X. Zou, Rachel Rudinger, Hal Daumé
PILM
12 Dec 2023