An Image is Worth a Thousand Toxic Words: A Metamorphic Testing
Framework for Content Moderation Software

18 August 2023

Michael R. Lyu

Papers citing "An Image is Worth a Thousand Toxic Words: A Metamorphic Testing Framework for Content Moderation Software"

Title | Authors | Tags | Counts | Date
The Earth is Flat? Unveiling Factual Errors in Large Language Models | Wenxuan Wang, Juluan Shi, Zhaopeng Tu, Youliang Yuan, Jen-tse Huang, Wenxiang Jiao, Michael R. Lyu | KELM, HILM, SyDa | 47, 1, 0 | 01 Jan 2024
Validating Multimedia Content Moderation Software via Semantic Fusion | Wenxuan Wang, Jingyuan Huang, Chang Chen, Jiazhen Gu, Jianping Zhang, Weibin Wu, Pinjia He, Michael Lyu | | 73, 9, 0 | 23 May 2023
Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-based Hate | Hannah Rose Kirk, B. Vidgen, Paul Röttger, Tristan Thrush, Scott A. Hale | | 67, 57, 0 | 12 Aug 2021
Adversarial examples in the physical world | Alexey Kurakin, Ian Goodfellow, Samy Bengio | SILM, AAML | 287, 5,837, 0 | 08 Jul 2016