StereoSet: Measuring stereotypical bias in pretrained language models
Moin Nadeem, Anna Bethke, Siva Reddy
arXiv:2004.09456 · 20 April 2020
Papers citing "StereoSet: Measuring stereotypical bias in pretrained language models" (50 of 159 papers shown)
Multilingual large language models leak human stereotypes across language boundaries
Yang Trista Cao, Anna Sotnikova, Jieyu Zhao, Linda X. Zou, Rachel Rudinger, Hal Daumé (12 Dec 2023) [PILM]

JAB: Joint Adversarial Prompting and Belief Augmentation
Ninareh Mehrabi, Palash Goyal, Anil Ramakrishna, Jwala Dhamala, Shalini Ghosh, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta (16 Nov 2023) [AAML]

Alquist 5.0: Dialogue Trees Meet Generative Models. A Novel Approach for Enhancing SocialBot Conversations
Ondrej Kobza, Jan Cuhel, Tommaso Gargiani, David Herel, Petr Marek (24 Oct 2023)

Identifying and Adapting Transformer-Components Responsible for Gender Bias in an English Language Model
Abhijith Chintam, Rahel Beloch, Willem H. Zuidema, Michael Hanna, Oskar van der Wal (19 Oct 2023)

Quantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting
Melanie Sclar, Yejin Choi, Yulia Tsvetkov, Alane Suhr (17 Oct 2023)

Emerging Challenges in Personalized Medicine: Assessing Demographic Effects on Biomedical Question Answering Systems
Sagi Shaier, Kevin Bennett, Lawrence E Hunter, K. Wense (16 Oct 2023)

Beyond Testers' Biases: Guiding Model Testing with Knowledge Bases using LLMs
Chenyang Yang, Rishabh Rustogi, Rachel A. Brower-Sinning, Grace A. Lewis, Christian Kastner, Tongshuang Wu (14 Oct 2023) [KELM]

OpinionGPT: Modelling Explicit Biases in Instruction-Tuned LLMs
Patrick Haller, Ansar Aynetdinov, Alan Akbik (07 Sep 2023)

Bias Testing and Mitigation in LLM-based Code Generation
Dong Huang, Qingwen Bu, Jie M. Zhang, Xiaofei Xie, Junjie Chen, Heming Cui (03 Sep 2023)

Thesis Distillation: Investigating The Impact of Bias in NLP Models on Hate Speech Detection
Fatma Elsafoury (31 Aug 2023)

FairMonitor: A Four-Stage Automatic Framework for Detecting Stereotypes and Biases in Large Language Models
Yanhong Bai, Jiabao Zhao, Jinxin Shi, Tingjiang Wei, Xingjiao Wu, Liangbo He (21 Aug 2023)

A Survey on Fairness in Large Language Models
Yingji Li, Mengnan Du, Rui Song, Xin Wang, Ying Wang (20 Aug 2023) [ALM]

The Unequal Opportunities of Large Language Models: Revealing Demographic Bias through Job Recommendations
A. Salinas, Parth Vipul Shah, Yuzhong Huang, Robert McCormack, Fred Morstatter (03 Aug 2023)

Power-up! What Can Generative Models Do for Human Computation Workflows?
Garrett Allen, Gaole He, U. Gadiraju (05 Jul 2023)

Prompt Tuning Pushes Farther, Contrastive Learning Pulls Closer: A Two-Stage Approach to Mitigate Social Biases
Yingji Li, Mengnan Du, Xin Wang, Ying Wang (04 Jul 2023)

An Empirical Analysis of Parameter-Efficient Methods for Debiasing Pre-Trained Language Models
Zhongbin Xie, Thomas Lukasiewicz (06 Jun 2023)

Are Diffusion Models Vision-And-Language Reasoners?
Benno Krojer, Elinor Poole-Dayan, Vikram S. Voleti, Christopher Pal, Siva Reddy (25 May 2023)

Uncovering and Quantifying Social Biases in Code Generation
Yong-Jin Liu, Xiaokang Chen, Yan Gao, Zhe Su, Fengji Zhang, Daoguang Zan, Jian-Guang Lou, Pin-Yu Chen, Tsung-Yi Ho (24 May 2023)

An Efficient Multilingual Language Model Compression through Vocabulary Trimming
Asahi Ushio, Yi Zhou, Jose Camacho-Collados (24 May 2023)

Having Beer after Prayer? Measuring Cultural Bias in Large Language Models
Tarek Naous, Michael Joseph Ryan, Alan Ritter, Wei-ping Xu (23 May 2023)

A Trip Towards Fairness: Bias and De-Biasing in Large Language Models
Leonardo Ranaldi, Elena Sofia Ruzzetti, Davide Venditti, Dario Onorati, Fabio Massimo Zanzotto (23 May 2023)

BiasX: "Thinking Slow" in Toxic Content Moderation with Explanations of Implied Social Biases
Yiming Zhang, Sravani Nanduri, Liwei Jiang, Tongshuang Wu, Maarten Sap (23 May 2023)

Language-Agnostic Bias Detection in Language Models with Bias Probing
Abdullatif Köksal, Omer F. Yalcin, Ahmet Akbiyik, M. Kilavuz, Anna Korhonen, Hinrich Schütze (22 May 2023)

Should We Attend More or Less? Modulating Attention for Fairness
A. Zayed, Gonçalo Mordido, Samira Shabanian, Sarath Chandar (22 May 2023)

This Prompt is Measuring <MASK>: Evaluating Bias Evaluation in Language Models
Seraphina Goldfarb-Tarrant, Eddie L. Ungless, Esma Balkir, Su Lin Blodgett (22 May 2023)

Comparing Biases and the Impact of Multilingual Training across Multiple Languages
Sharon Levy, Neha Ann John, Ling Liu, Yogarshi Vyas, Jie Ma, Yoshinari Fujinuma, Miguel Ballesteros, Vittorio Castelli, Dan Roth (18 May 2023)

Semantically Aligned Task Decomposition in Multi-Agent Reinforcement Learning
Wenhao Li, Dan Qiao, Baoxiang Wang, Xiangfeng Wang, Bo Jin, H. Zha (18 May 2023)

ChatGPT Perpetuates Gender Bias in Machine Translation and Ignores Non-Gendered Pronouns: Findings across Bengali and Five other Low-Resource Languages
Sourojit Ghosh, Aylin Caliskan (17 May 2023)

DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining
Sang Michael Xie, Hieu H. Pham, Xuanyi Dong, Nan Du, Hanxiao Liu, Yifeng Lu, Percy Liang, Quoc V. Le, Tengyu Ma, Adams Wei Yu (17 May 2023) [MoMe, MoE]

Scratch Copilot Evaluation: Assessing AI-Assisted Creative Coding for Families
Stefania Druga, Nancy Otero (17 May 2023) [AI4Ed]

Language Model Tokenizers Introduce Unfairness Between Languages
Aleksandar Petrov, Emanuele La Malfa, Philip Torr, Adel Bibi (17 May 2023)

On the Origins of Bias in NLP through the Lens of the Jim Code
Fatma Elsafoury, Gavin Abercrombie (16 May 2023)

Constructing Holistic Measures for Social Biases in Masked Language Models
Yang Liu, Yuexian Hou (12 May 2023)

Evaluation of Social Biases in Recent Large Pre-Trained Models
Swapnil Sharma, Nikita Anand, V. KranthiKiranG., Alind Jain (13 Apr 2023)

Assessing Language Model Deployment with Risk Cards
Leon Derczynski, Hannah Rose Kirk, Vidhisha Balachandran, Sachin Kumar, Yulia Tsvetkov, M. Leiser, Saif Mohammad (31 Mar 2023)

An Overview on Language Models: Recent Developments and Outlook
Chengwei Wei, Yun Cheng Wang, Bin Wang, C.-C. Jay Kuo (10 Mar 2023)

Logic Against Bias: Textual Entailment Mitigates Stereotypical Sentence Reasoning
Hongyin Luo, James R. Glass (10 Mar 2023) [NAI]

A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT
Yihan Cao, Siyu Li, Yixin Liu, Zhiling Yan, Yutong Dai, Philip S. Yu, Lichao Sun (07 Mar 2023)

In-Depth Look at Word Filling Societal Bias Measures
Matúš Pikuliak, Ivana Benová, Viktor Bachratý (24 Feb 2023)

Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, C. Endres, Thorsten Holz, Mario Fritz (23 Feb 2023) [SILM]

Auditing large language models: a three-layered approach
Jakob Mokander, Jonas Schuett, Hannah Rose Kirk, Luciano Floridi (16 Feb 2023) [AILaw, MLAU]

The Capacity for Moral Self-Correction in Large Language Models
Deep Ganguli, Amanda Askell, Nicholas Schiefer, Thomas I. Liao, Kamilė Lukošiūtė, ..., Tom B. Brown, C. Olah, Jack Clark, Sam Bowman, Jared Kaplan (15 Feb 2023) [LRM, ReLM]

Guiding Pretraining in Reinforcement Learning with Large Language Models
Yuqing Du, Olivia Watkins, Zihan Wang, Cédric Colas, Trevor Darrell, Pieter Abbeel, Abhishek Gupta, Jacob Andreas (13 Feb 2023) [LM&Ro]

Counter-GAP: Counterfactual Bias Evaluation through Gendered Ambiguous Pronouns
Zhongbin Xie, Vid Kocijan, Thomas Lukasiewicz, Oana-Maria Camburu (11 Feb 2023)

Nationality Bias in Text Generation
Pranav Narayanan Venkit, Sanjana Gautam, Ruchi Panchanadikar, Ting-Hao 'Kenneth' Huang, Shomir Wilson (05 Feb 2023)

FineDeb: A Debiasing Framework for Language Models
Akash Saravanan, Dhruv Mullick, Habibur Rahman, Nidhi Hegde (05 Feb 2023) [FedML, AI4CE]

Debiasing Vision-Language Models via Biased Prompts
Ching-Yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka (31 Jan 2023) [VLM]

Bipol: Multi-axes Evaluation of Bias with Explainability in Benchmark Datasets
Tosin P. Adewumi, Isabella Sodergren, Lama Alkhaled, Sana Sabah Sabry, F. Liwicki, Marcus Liwicki (28 Jan 2023)

SensePOLAR: Word sense aware interpretability for pre-trained contextual word embeddings
Jan Engler, Sandipan Sikdar, Marlene Lutz, M. Strohmaier (11 Jan 2023)

Trustworthy Social Bias Measurement
Rishi Bommasani, Percy Liang (20 Dec 2022)