AI in the Gray: Exploring Moderation Policies in Dialogic Large Language Models vs. Human Answers in Controversial Topics

28 August 2023

Papers citing "AI in the Gray: Exploring Moderation Policies in Dialogic Large Language Models vs. Human Answers in Controversial Topics"


Only a Little to the Left: A Theory-grounded Measure of Political Bias in Large Language Models
Mats Faulborn, Indira Sen, Max Pellert, Andreas Spitz, David Garcia
ELM; 45, 0, 0; 20 Mar 2025

HateSieve: A Contrastive Learning Framework for Detecting and Segmenting Hateful Content in Multimodal Memes
Xuanyu Su, Yansong Li, Diana Inkpen, Nathalie Japkowicz
VLM; 81, 2, 0; 11 Aug 2024

Testing AI on language comprehension tasks reveals insensitivity to underlying meaning
Vittoria Dentella, Fritz Guenther, Elliot Murphy, G. Marcus, Evelina Leivada
ELM; 40, 26, 0; 23 Feb 2023