Social Bias Frames: Reasoning about Social and Power Implications of Language

10 November 2019

Dan Jurafsky

Yejin Choi

Papers citing "Social Bias Frames: Reasoning about Social and Power Implications of Language"

50 / 95 papers shown

Title
BiasGuard: A Reasoning-enhanced Bias Detection Tool For Large Language Models Zhiting Fan Ruizhe Chen Zuozhu Liu 44 0 0 30 Apr 2025
Evaluation and Facilitation of Online Discussions in the LLM Era: A Survey Katerina Korre Dimitris Tsirmpas Nikos Gkoumas Emma Cabalé Danai Myrtzani Theodoros Evgeniou Ion Androutsopoulos 40 1 0 03 Mar 2025
Self-Rationalization in the Wild: A Large Scale Out-of-Distribution Evaluation on NLI-related tasks Jing Yang Max Glockner Anderson de Rezende Rocha Iryna Gurevych LRM 73 1 0 07 Feb 2025
Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs Angelina Wang Michelle Phan Daniel E. Ho Sanmi Koyejo 54 2 0 04 Feb 2025
The Goofus & Gallant Story Corpus for Practical Value Alignment Md Sultan al Nahian Tasmia Tasrin Spencer Frazier Mark O. Riedl Brent Harrison 50 0 0 17 Jan 2025
Towards Efficient and Explainable Hate Speech Detection via Model Distillation Paloma Piot Javier Parapar 83 173 0 18 Dec 2024
ClarityEthic: Explainable Moral Judgment Utilizing Contrastive Ethical Insights from Large Language Models Yuxi Sun Wei Gao Jing Ma Hongzhan Lin Ziyang Luo Wenxuan Zhang ELM 82 0 0 17 Dec 2024
Smaller Large Language Models Can Do Moral Self-Correction Guangliang Liu Zhiyu Xue Rongrong Wang Kristen Marie Johnson LRM 32 0 0 30 Oct 2024
LLMScan: Causal Scan for LLM Misbehavior Detection Mengdi Zhang Kai Kiat Goh Peixin Zhang Jun Sun Rose Lin Xin Hongyu Zhang 25 0 0 22 Oct 2024
Epistemological Bias As a Means for the Automated Detection of Injustices in Text Kenya Andrews Lamogha Chiazor 30 0 0 08 Jul 2024
Does Cross-Cultural Alignment Change the Commonsense Morality of Language Models? Yuu Jinnai 49 1 0 24 Jun 2024
AustroTox: A Dataset for Target-Based Austrian German Offensive Language Detection Pia Pachinger Janis Goldzycher A. Planitzer Wojciech Kusa Allan Hanbury Julia Neidhardt 47 2 0 12 Jun 2024
Hire Me or Not? Examining Language Model's Behavior with Occupation Attributes Damin Zhang Yi Zhang Geetanjali Bihani Julia Taylor Rayz 53 2 0 06 May 2024
PejorativITy: Disambiguating Pejorative Epithets to Improve Misogyny Detection in Italian Tweets Arianna Muti Federico Ruggeri Cagri Toraman Lorenzo Musetti Samuel Algherini Silvia Ronchi G. Saretto Caterina Zapparoli Alberto Barrón-Cedeño 23 3 0 03 Apr 2024
Target Span Detection for Implicit Harmful Content Nazanin Jafari James Allan Sheikh Muhammad Sarwar 43 1 0 28 Mar 2024
Rectifying Demonstration Shortcut in In-Context Learning Joonwon Jang Sanghwan Jang Wonbin Kweon Minjin Jeon Hwanjo Yu 37 1 0 14 Mar 2024
Efficient Toxic Content Detection by Bootstrapping and Distilling Large Language Models Jiang Zhang Qiong Wu Yiming Xu Cheng Cao Zheng Du Konstantinos Psounis 33 14 0 13 Dec 2023
Cross Fertilizing Empathy from Brain to Machine as a Value Alignment Strategy Devin Gonier Adrian Adduci Cassidy LoCascio 29 0 0 10 Dec 2023
Interpretation modeling: Social grounding of sentences by reasoning over their implicit moral judgments Liesbeth Allein Maria Mihaela Trușcă Marie-Francine Moens 33 1 0 27 Nov 2023
MOKA: Moral Knowledge Augmentation for Moral Event Extraction Xinliang Frederick Zhang Winston Wu Nick Beauchamp Lu Wang 35 7 0 16 Nov 2023
Beyond Denouncing Hate: Strategies for Countering Implied Biases and Stereotypes in Language Jimin Mun Emily Allaway Akhila Yerukola Laura Vianna Sarah-Jane Leslie Maarten Sap 16 22 0 31 Oct 2023
Improving Few-shot Generalization of Safety Classifiers via Data Augmented Parameter-Efficient Fine-Tuning Ananth Balashankar Xiao Ma Aradhana Sinha Ahmad Beirami Yao Qin Jilin Chen Alex Beutel 24 2 0 25 Oct 2023
STREAM: Social data and knowledge collective intelligence platform for TRaining Ethical AI Models Yuwei Wang Enmeng Lu Zizhe Ruan Yao Liang Yi Zeng AI4TS 29 4 0 09 Oct 2023
Adding guardrails to advanced chatbots Yanchen Wang Lisa Singh AI4MH 17 7 0 13 Jun 2023
CL-UZH at SemEval-2023 Task 10: Sexism Detection through Incremental Fine-Tuning and Multi-Task Learning with Label Descriptions Janis Goldzycher 18 1 0 06 Jun 2023
NormBank: A Knowledge Bank of Situational Social Norms Caleb Ziems Jane Dwivedi-Yu Yi-Chia Wang A. Halevy Diyi Yang 23 41 0 26 May 2023
Do LLMs Understand Social Knowledge? Evaluating the Sociability of Large Language Models with SocKET Benchmark Minje Choi Jiaxin Pei Sagar Kumar Chang Shu David Jurgens ALM LLMAG 35 69 0 24 May 2023
TalkUp: Paving the Way for Understanding Empowering Language Lucille Njoo Chan Young Park Octavia Stappart Marvin Thielk Yi Chu Yulia Tsvetkov 16 3 0 23 May 2023
BiasX: "Thinking Slow" in Toxic Content Moderation with Explanations of Implied Social Biases Yiming Zhang Sravani Nanduri Liwei Jiang Tongshuang Wu Maarten Sap 47 7 0 23 May 2023
SPARSEFIT: Few-shot Prompting with Sparse Fine-tuning for Jointly Generating Predictions and Natural Language Explanations Jesus Solano Oana-Maria Camburu Pasquale Minervini 20 1 0 22 May 2023
Comparing Biases and the Impact of Multilingual Training across Multiple Languages Sharon Levy Neha Ann John Ling Liu Yogarshi Vyas Jie Ma Yoshinari Fujinuma Miguel Ballesteros Vittorio Castelli Dan Roth 26 25 0 18 May 2023
PaLM 2 Technical Report Rohan Anil Andrew M. Dai Orhan Firat Melvin Johnson Dmitry Lepikhin ... Ce Zheng Wei Zhou Denny Zhou Slav Petrov Yonghui Wu ReLM LRM 116 1,148 0 17 May 2023
When the Majority is Wrong: Modeling Annotator Disagreement for Subjective Tasks Eve Fleisig Rediet Abebe Dan Klein 34 43 0 11 May 2023
Are Human Explanations Always Helpful? Towards Objective Evaluation of Human Natural Language Explanations Bingsheng Yao Prithviraj Sen Lucian Popa James A. Hendler Dakuo Wang XAI ELM FAtt 25 10 0 04 May 2023
Understanding and Predicting Human Label Variation in Natural Language Inference through Explanation Nan-Jiang Jiang Chenhao Tan M. Marneffe 32 2 0 24 Apr 2023
Sociocultural knowledge is needed for selection of shots in hate speech detection tasks Antonis Maronikolakis Abdullatif Köksal Hinrich Schütze 43 0 0 04 Apr 2023
Towards Countering Essentialism through Social Bias Reasoning Emily Allaway Nina Taneja Sarah-Jane Leslie Maarten Sap 19 4 0 28 Mar 2023
Natural Language Reasoning, A Survey Fei Yu Hongbo Zhang Prayag Tiwari Benyou Wang ReLM LRM 49 51 0 26 Mar 2023
SemEval-2023 Task 10: Explainable Detection of Online Sexism Hannah Rose Kirk Wenjie Yin Bertie Vidgen Paul Röttger 24 117 0 07 Mar 2023
CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network Sreyan Ghosh Manan Suri Purva Chiniya Utkarsh Tyagi Sonal Kumar Dinesh Manocha 27 12 0 02 Mar 2023
The Capacity for Moral Self-Correction in Large Language Models Deep Ganguli Amanda Askell Nicholas Schiefer Thomas I. Liao Kamilė Lukošiūtė ... Tom B. Brown C. Olah Jack Clark Sam Bowman Jared Kaplan LRM ReLM 45 158 0 15 Feb 2023
Bipol: Multi-axes Evaluation of Bias with Explainability in Benchmark Datasets Tosin P. Adewumi Isabella Sodergren Lama Alkhaled Sana Sabah Sabry F. Liwicki Marcus Liwicki 35 4 0 28 Jan 2023
Characterizing the Entities in Harmful Memes: Who is the Hero, the Villain, the Victim? Shivam Sharma Atharva Kulkarni Tharun Suresh Himanshi Mathur Preslav Nakov Md. Shad Akhtar Tanmoy Chakraborty 38 15 0 26 Jan 2023
Bike Frames: Understanding the Implicit Portrayal of Cyclists in the News Xingmeng Zhao Dan Schumacher Sashank Nalluri Xavier Walton Suhana Shrestha Anthony Rios 26 2 0 15 Jan 2023
Everyone's Voice Matters: Quantifying Annotation Disagreement Using Demographic Information Ruyuan Wan Jaehyung Kim Dongyeop Kang 14 36 0 12 Jan 2023
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits Ruibo Liu Chenyan Jia Ge Zhang Ziyu Zhuang Tony X. Liu Soroush Vosoughi 99 35 0 01 Jan 2023
Leveraging World Knowledge in Implicit Hate Speech Detection Jessica Lin 21 6 0 28 Dec 2022
CREPE: Open-Domain Question Answering with False Presuppositions Xinyan Velocity Yu Sewon Min Luke Zettlemoyer Hannaneh Hajishirzi 16 45 0 30 Nov 2022
Casual Conversations v2: Designing a large consent-driven dataset to measure algorithmic bias and robustness C. Hazirbas Yejin Bang Tiezheng Yu Parisa Assar Bilal Porgali ... Jacqueline Pan Emily McReynolds Miranda Bogen Pascale Fung Cristian Canton Ferrer 29 8 0 10 Nov 2022
EvEntS ReaLM: Event Reasoning of Entity States via Language Models Evangelia Spiliopoulou Artidoro Pagnoni Yonatan Bisk Eduard H. Hovy LRM ReLM 32 10 0 10 Nov 2022