ResearchTrend.AI

Something Just Like TRuST : Toxicity Recognition of Span and Target

2 June 2025
Berk Atil
Namrata Sureddy
Rebecca J. Passonneau
Main: 8 pages · 7 figures · 7 tables · Bibliography: 4 pages · Appendix: 6 pages
Abstract

Toxicity in online content, including content generated by language models, has become a critical concern due to its potential for negative psychological and social impact. This paper introduces TRuST, a comprehensive dataset designed to improve toxicity detection. It merges existing datasets and provides labels for toxicity, target social group, and toxic spans. It covers a diverse range of target groups, including ethnicity, gender, religion, disability, and politics, with both human- and machine-annotated data and both human- and machine-generated text. We benchmark state-of-the-art large language models (LLMs) on toxicity detection, target group identification, and toxic span extraction. We find that fine-tuned models consistently outperform zero-shot and few-shot prompting, though performance remains low for certain social groups. Further, reasoning capabilities do not significantly improve performance, indicating that LLMs have weak social reasoning skills.
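To make the three annotation layers concrete, here is an illustrative sketch of what a single example with toxicity, target-group, and toxic-span labels might look like, along with a character-level F1 score for span extraction. The field names, label values, and metric are assumptions for illustration only; the abstract does not specify the actual TRuST schema or evaluation protocol.

```python
# Hypothetical record with the three label types described in the abstract.
text = "people from group X are awful"
record = {
    "text": text,
    "toxicity": 1,                 # binary toxicity label (assumed encoding)
    "target_group": "ethnicity",   # e.g. ethnicity, gender, religion, disability, politics
    "toxic_span": (0, len(text)),  # character offsets of the toxic span (assumed format)
}

def span_char_f1(pred, gold):
    """Character-level F1 between a predicted and a gold span (start, end)."""
    pred_chars = set(range(pred[0], pred[1]))
    gold_chars = set(range(gold[0], gold[1]))
    if not pred_chars or not gold_chars:
        return 0.0
    overlap = len(pred_chars & gold_chars)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_chars)
    recall = overlap / len(gold_chars)
    return 2 * precision * recall / (precision + recall)

# A perfect span prediction scores 1.0; a disjoint one scores 0.0.
print(span_char_f1(record["toxic_span"], record["toxic_span"]))
```

Character-level overlap metrics of this kind are a common choice for toxic-span tasks; whether the paper uses this exact metric is not stated in the abstract.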

@article{atil2025_2506.02326,
  title={Something Just Like TRuST: Toxicity Recognition of Span and Target},
  author={Berk Atil and Namrata Sureddy and Rebecca J. Passonneau},
  journal={arXiv preprint arXiv:2506.02326},
  year={2025}
}