Persistent Anti-Muslim Bias in Large Language Models

14 January 2021

Papers citing "Persistent Anti-Muslim Bias in Large Language Models"

50 / 295 papers shown

Title, authors, topic tags, three numeric counts, date
Attributional Safety Failures in Large Language Models under Code-Mixed Perturbations Somnath Banerjee Pratyush Chatterjee Shanu Kumar Sayan Layek Parag Agrawal Rima Hazra Animesh Mukherjee AAML 14 0 0 20 May 2025
From n-gram to Attention: How Model Architectures Learn and Propagate Bias in Language Modeling Mohsinul Kabir Tasfia Tahsin Sophia Ananiadou KELM AI4CE 12 0 0 18 May 2025
A Comprehensive Analysis of Large Language Model Outputs: Similarity, Diversity, and Bias Brandon Smith Mohamed Reda Bouadjenek Tahsin Alamgir Kheya Phillip Dawson S. Aryal ALM ELM 44 0 0 14 May 2025
WorldView-Bench: A Benchmark for Evaluating Global Cultural Perspectives in Large Language Models Abdullah Mushtaq Imran Taj Rafay Naeem Ibrahim Ghaznavi Junaid Qadir 30 0 0 14 May 2025
Developing A Framework to Support Human Evaluation of Bias in Generated Free Response Text Jennifer Healey Laurie Byrum Md Nadeem Akhtar Surabhi Bhargava Moumita Sinha 36 0 0 05 May 2025
Biased by Design: Leveraging AI Biases to Enhance Critical Thinking of News Readers L. Zavolokina Kilian Sprenkamp Zoya Katashinskaya Daniel Gordon Jones 41 0 0 20 Apr 2025
DICE: A Framework for Dimensional and Contextual Evaluation of Language Models Aryan Shrivastava Paula Akemi Aoyagui 38 0 0 14 Apr 2025
An Evaluation of Cultural Value Alignment in LLM Nicholas Sukiennik Chen Gao Fengli Xu Yongqian Li 29 0 0 11 Apr 2025
Benchmarking Adversarial Robustness to Bias Elicitation in Large Language Models: Scalable Automated Assessment with LLM-as-a-Judge Riccardo Cantini A. Orsino Massimo Ruggiero Domenico Talia AAML ELM 50 1 0 10 Apr 2025
NLP Security and Ethics, in the Wild Heather Lent Erick Galinkin Yiyi Chen Jens Myrup Pedersen Leon Derczynski Johannes Bjerva SILM 52 0 0 09 Apr 2025
Through the LLM Looking Glass: A Socratic Probing of Donkeys, Elephants, and Markets Molly Kennedy Ayyoob Imani Timo Spinde Hinrich Schütze 55 1 0 20 Mar 2025
Implicit Bias-Like Patterns in Reasoning Models Messi H.J. Lee Calvin K. Lai LRM 61 0 0 14 Mar 2025
Representation-based Reward Modeling for Efficient Safety Alignment of Large Language Model Qiyuan Deng X. Bai Kehai Chen Yaowei Wang Liqiang Nie Min Zhang OffRL 71 0 0 13 Mar 2025
Exploring Bias in over 100 Text-to-Image Generative Models J. Vice Naveed Akhtar Richard I. Hartley Ajmal Mian EGVM 69 3 0 11 Mar 2025
Exposing Product Bias in LLM Investment Recommendation Yuhan Zhi Xiaoyu Zhang Longtian Wang Shumin Jiang Shiqing Ma Xiaohong Guan Chao Shen 68 0 0 11 Mar 2025
Uncovering Gaps in How Humans and LLMs Interpret Subjective Language Erik Jones Arjun Patrawala Jacob Steinhardt 49 0 0 06 Mar 2025
Human Preferences for Constructive Interactions in Language Model Alignment Yara Kyrychenko Jon Roozenbeek Brandon Davidson S. V. D. Linden Ramit Debnath 46 0 0 05 Mar 2025
Analyzing the Safety of Japanese Large Language Models in Stereotype-Triggering Prompts Akito Nakanishi Yukie Sano Geng Liu Francesco Pierri 60 0 0 03 Mar 2025
Building Safe GenAI Applications: An End-to-End Overview of Red Teaming for Large Language Models Alberto Purpura Sahil Wadhwa Jesse Zymet Akshay Gupta Andy Luo Melissa Kazemi Rad Swapnil Shinde Mohammad Sorower AAML 272 0 0 03 Mar 2025
Variance reduction in output from generative AI Yu Xie Yueqi Xie 46 0 0 02 Mar 2025
More of the Same: Persistent Representational Harms Under Increased Representation Jennifer Mickel Maria De-Arteaga Leqi Liu Kevin Tian 44 0 0 01 Mar 2025
C3AI: Crafting and Evaluating Constitutions for Constitutional AI Yara Kyrychenko Ke Zhou Edyta Bogucka Daniele Quercia ELM 55 3 0 21 Feb 2025
Fine-Tuned LLMs are "Time Capsules" for Tracking Societal Bias Through Books Sangmitra Madhusudan Robert D Morabito Skye Reid Nikta Gohari Sadr Ali Emami 69 1 0 07 Feb 2025
Owls are wise and foxes are unfaithful: Uncovering animal stereotypes in vision-language models Tabinda Aman Mohammad Nadeem S. Sohail Mohammad Anas Min Zhang VLM 66 1 0 21 Jan 2025
Enhancing Patient-Centric Communication: Leveraging LLMs to Simulate Patient Perspectives Xinyao Ma Rui Zhu Zihao Wang Jingwei Xiong Qingyu Chen Haixu Tang L. Jean Camp Lucila Ohno-Machado LM&MA 53 0 0 12 Jan 2025
Gender-Neutral Large Language Models for Medical Applications: Reducing Bias in PubMed Abstracts Elizabeth Schaefer Kirk Roberts 41 0 0 10 Jan 2025
Toward Inclusive Educational AI: Auditing Frontier LLMs through a Multiplexity Lens Abdullah Mushtaq Muhammad Rafay Naeem Muhammad Imran Taj Ibrahim Ghaznavi Junaid Qadir 47 2 0 08 Jan 2025
Explicit vs. Implicit: Investigating Social Bias in Large Language Models through Self-Reflection Yachao Zhao Bo Wang Yan Wang 52 2 0 04 Jan 2025
Implicit Priors Editing in Stable Diffusion via Targeted Token Adjustment Feng He Chao Zhang Zhixue Zhao 94 0 0 04 Dec 2024
Towards Resource Efficient and Interpretable Bias Mitigation in Large Language Models S. Tong Eliott Zemour Rawisara Lohanimit Lalana Kagal 70 0 0 02 Dec 2024
Adaptive Deployment of Untrusted LLMs Reduces Distributed Threats Jiaxin Wen Vivek Hebbar Caleb Larson Aryan Bhatt Ansh Radhakrishnan ... Shi Feng He He Ethan Perez Buck Shlegeris Akbir Khan AAML 88 8 0 26 Nov 2024
RV4Chatbot: Are Chatbots Allowed to Dream of Electric Sheep? A. Gatti Viviana Mascardi Angelo Ferrando 61 0 0 21 Nov 2024
On the Fairness, Diversity and Reliability of Text-to-Image Generative Models J. Vice Naveed Akhtar Richard I. Hartley Ajmal Mian EGVM 73 0 0 21 Nov 2024
BetterBench: Assessing AI Benchmarks, Uncovering Issues, and Establishing Best Practices Anka Reuel Amelia F. Hardy Chandler Smith Max Lamparth Malcolm Hardy Mykel J. Kochenderfer ELM 98 18 0 20 Nov 2024
Bias in Large Language Models: Origin, Evaluation, and Mitigation Yufei Guo Muzhe Guo Juntao Su Zhou Yang Mengqiu Zhu Hongfei Li Mengyang Qiu Shuo Shuo Liu AILaw 33 11 0 16 Nov 2024
Distribution Learning with Valid Outputs Beyond the Worst-Case Nick Rittler Kamalika Chaudhuri 29 0 0 21 Oct 2024
With a Grain of SALT: Are LLMs Fair Across Social Dimensions? Samee Arif Zohaib Khan Agha Ali Raza Awais Athar 41 0 0 16 Oct 2024
Evaluating Gender Bias of LLMs in Making Morality Judgements Divij Bajaj Yuanyuan Lei Jonathan Tong Ruihong Huang 37 3 0 13 Oct 2024
Generative AI and Perceptual Harms: Who's Suspected of using LLMs? Kowe Kadoma D. Metaxa Mor Naaman 39 3 0 01 Oct 2024
Mitigating Propensity Bias of Large Language Models for Recommender Systems Guixian Zhang Guan Yuan Debo Cheng Lin Liu Jiuyong Li Shichao Zhang 47 2 0 30 Sep 2024
'Simulacrum of Stories': Examining Large Language Models as Qualitative Research Participants Shivani Kapania William Agnew Motahhare Eslami Hoda Heidari Sarah E Fox 47 4 0 28 Sep 2024
AI Can Be Cognitively Biased: An Exploratory Study on Threshold Priming in LLM-Based Batch Relevance Assessment Nuo Chen Jiqun Liu Xiaoyu Dong Qijiong Liu Tetsuya Sakai Xiao-Ming Wu 37 10 0 24 Sep 2024
STOP! Benchmarking Large Language Models with Sensitivity Testing on Offensive Progressions Robert D Morabito Sangmitra Madhusudan Tyler McDonald Ali Emami 36 0 0 20 Sep 2024
AI Suggestions Homogenize Writing Toward Western Styles and Diminish Cultural Nuances Dhruv Agarwal Mor Naaman Aditya Vashistha 43 16 0 17 Sep 2024
Identity-related Speech Suppression in Generative AI Content Moderation Oghenefejiro Isaacs Anigboro Charlie M. Crawford Danaë Metaxa Sorelle A. Friedler 26 0 0 09 Sep 2024
More is More: Addition Bias in Large Language Models Luca Santagata Cristiano De Nobili 33 1 0 04 Sep 2024
It is Time to Develop an Auditing Framework to Promote Value Aware Chatbots Yanchen Wang Lisa Singh 31 1 0 03 Sep 2024
WeQA: A Benchmark for Retrieval Augmented Generation in Wind Energy Domain Rounak Meyur Hung Phan S. Wagle Jan Strube M. Halappanavar Sameera Horawalavithana Anurag Acharya Sai Munikoti 25 1 0 21 Aug 2024
Towards "Differential AI Psychology" and in-context Value-driven Statement Alignment with Moral Foundations Theory Simon Münker SyDa 32 0 0 21 Aug 2024
Strategic Demonstration Selection for Improved Fairness in LLM In-Context Learning Jingyu Hu Weiru Liu Mengnan Du 30 2 0 19 Aug 2024