2012.10289
Cited By
HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection
18 December 2020
Binny Mathew
Punyajoy Saha
Seid Muhie Yimam
Chris Biemann
Pawan Goyal
Animesh Mukherjee
Papers citing
"HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection"
50 / 280 papers shown
Title
Centering the Margins: Outlier-Based Identification of Harmed Populations in Toxicity Detection
Vyoma Raman
Eve Fleisig
Dan Klein
27
0
0
24 May 2023
Towards Legally Enforceable Hate Speech Detection for Public Forums
Chunyan Luo
R. Bhambhoria
Xiao-Dan Zhu
Samuel Dahan
AILaw
33
5
0
23 May 2023
On Bias and Fairness in NLP: Investigating the Impact of Bias and Debiasing in Language Models on the Fairness of Toxicity Detection
Fatma Elsafoury
Stamos Katsigiannis
32
1
0
22 May 2023
Transferring Fairness using Multi-Task Learning with Limited Demographic Information
Carlos Alejandro Aguirre
Mark Dredze
35
0
0
22 May 2023
Analyzing Norm Violations in Live-Stream Chat
Jihyung Moon
Dong-Ho Lee
Hyundong Justin Cho
Woojeong Jin
Chan Young Park
MinWoo Kim
Jonathan May
Jay Pujara
Sungjoon Park
23
4
0
18 May 2023
Rethinking Multimodal Content Moderation from an Asymmetric Angle with Mixed-modality
Jialing Yuan
Ye Yu
Gaurav Mittal
Matthew Hall
Sandra Sajeev
Mei Chen
27
9
0
17 May 2023
From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models
Shangbin Feng
Chan Young Park
Yuhan Liu
Yulia Tsvetkov
33
228
0
15 May 2023
Facilitating Fine-grained Detection of Chinese Toxic Language: Hierarchical Taxonomy, Resources, and Benchmarks
Junyu Lu
Bo Xu
Xiaokun Zhang
C. Min
Liang Yang
Hongfei Lin
25
27
0
08 May 2023
Stanford MLab at SemEval-2023 Task 10: Exploring GloVe- and Transformer-Based Methods for the Explainable Detection of Online Sexism
Hee Jung Choi
Trevor Chow
Aaron Wan
Hong Meng Yam
Swetha Yogeswaran
Beining Zhou
36
1
0
07 May 2023
HQP: A Human-Annotated Dataset for Detecting Online Propaganda
Abdurahman Maarouf
Dominik Bär
Dominique Geissler
Stefan Feuerriegel
25
9
0
28 Apr 2023
Introducing MBIB -- the first Media Bias Identification Benchmark Task and Dataset Collection
Martin Wessel
Tomáš Horych
Terry Ruas
Akiko Aizawa
Bela Gipp
Timo Spinde
32
21
0
25 Apr 2023
"HOT" ChatGPT: The promise of ChatGPT in detecting and discriminating hateful, offensive, and toxic comments on social media
Lingyao Li
Lizhou Fan
Shubham Atreja
Libby Hemphill
AI4MH
47
84
0
20 Apr 2023
Sociocultural knowledge is needed for selection of shots in hate speech detection tasks
Antonis Maronikolakis
Abdullatif Köksal
Hinrich Schütze
43
0
0
04 Apr 2023
Hate Speech Targets Detection in Parler using BERT
Nadav Schneider
Shimon Shouei
Saleem Ghantous
Elad Feldman
23
4
0
03 Apr 2023
Mitigating Source Bias for Fairer Weak Supervision
Changho Shin
Sonia Cromp
Dyah Adila
Frederic Sala
29
2
0
30 Mar 2023
On the rise of fear speech in online social media
Punyajoy Saha
Kiran Garimella
Narla Komal Kalyan
Saurabh Kumar Pandey
Pauras Mangesh Meher
Binny Mathew
Animesh Mukherjee
9
22
0
18 Mar 2023
NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models
Yiran Ye
Thai Le
Dongwon Lee
AAML
DeLMO
41
0
0
18 Mar 2023
SemEval-2023 Task 10: Explainable Detection of Online Sexism
Hannah Rose Kirk
Wenjie Yin
Bertie Vidgen
Paul Röttger
24
117
0
07 Mar 2023
IFAN: An Explainability-Focused Interaction Framework for Humans and NLP Models
Edoardo Mosca
Daryna Dementieva
Tohid Ebrahim Ajdari
Maximilian Kummeth
Kirill Gringauz
Yutong Zhou
Georg Groh
24
8
0
06 Mar 2023
CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network
Sreyan Ghosh
Manan Suri
Purva Chiniya
Utkarsh Tyagi
Sonal Kumar
Dinesh Manocha
27
13
0
02 Mar 2023
Towards Safer Generative Language Models: A Survey on Safety Risks, Evaluations, and Improvements
Jiawen Deng
Jiale Cheng
Hao Sun
Zhexin Zhang
Minlie Huang
LM&MA
ELM
34
16
0
18 Feb 2023
Learning to Initialize: Can Meta Learning Improve Cross-task Generalization in Prompt Tuning?
Chengwei Qin
Q. Li
Ruochen Zhao
Chenyu You
VLM
LRM
23
15
0
16 Feb 2023
MTTM: Metamorphic Testing for Textual Content Moderation Software
Wenxuan Wang
Jen-tse Huang
Weibin Wu
Jianping Zhang
Yizhan Huang
Shuqing Li
Pinjia He
Michael Lyu
58
30
0
11 Feb 2023
Vicarious Offense and Noise Audit of Offensive Speech Classifiers: Unifying Human and Machine Disagreement on What is Offensive
Tharindu Cyril Weerasooriya
Sujan Dutta
Tharindu Ranasinghe
Marcos Zampieri
Christopher Homan
Ashiqur R. KhudaBukhsh
AAML
41
20
0
29 Jan 2023
Characterizing the Entities in Harmful Memes: Who is the Hero, the Villain, the Victim?
Shivam Sharma
Atharva Kulkarni
Tharun Suresh
Himanshi Mathur
Preslav Nakov
Md. Shad Akhtar
Tanmoy Chakraborty
38
15
0
26 Jan 2023
Qualitative Analysis of a Graph Transformer Approach to Addressing Hate Speech: Adapting to Dynamically Changing Content
Liam Hebert
Hong Chen
R. Cohen
Lukasz Golab
15
4
0
25 Jan 2023
ViHOS: Hate Speech Spans Detection for Vietnamese
Phu Gia Hoang
Canh Duc Luu
K. Tran
Kiet Van Nguyen
Ngan Luu-Thuy Nguyen
31
20
0
24 Jan 2023
Rationalizing Predictions by Adversarial Information Calibration
Lei Sha
Oana-Maria Camburu
Thomas Lukasiewicz
27
4
0
15 Jan 2023
Predicting Hateful Discussions on Reddit using Graph Transformer Networks and Communal Context
Liam Hebert
Lukasz Golab
R. Cohen
11
8
0
10 Jan 2023
Perplexed by Quality: A Perplexity-based Method for Adult and Harmful Content Detection in Multilingual Heterogeneous Web Data
Timm Jansen
Yangling Tong
V. Zevallos
Pedro Ortiz Suarez
22
17
0
20 Dec 2022
Human-in-the-Loop Hate Speech Classification in a Multilingual Context
Ana Kotarcic
Dominik Hangartner
Fabrizio Gilardi
Selina Kurer
K. Donnay
24
2
0
05 Dec 2022
SOLD: Sinhala Offensive Language Dataset
Tharindu Ranasinghe
Isuri Anuradha
Damith Premasiri
Kanishka Silva
Hansi Hettiarachchi
Lasitha Uyangodage
Marcos Zampieri
41
8
0
01 Dec 2022
Rationale-Guided Few-Shot Classification to Detect Abusive Language
Punyajoy Saha
Divyanshu Sheth
Kushal Kedia
Binny Mathew
Animesh Mukherjee
9
3
0
30 Nov 2022
ConceptX: A Framework for Latent Concept Analysis
Firoj Alam
Fahim Dalvi
Nadir Durrani
Hassan Sajjad
A. Khan
Jia Xu
33
5
0
12 Nov 2022
Cross-Platform and Cross-Domain Abusive Language Detection with Supervised Contrastive Learning
Md. Tawkat Islam Khondaker
Muhammad Abdul-Mageed
L. Lakshmanan
20
1
0
11 Nov 2022
How Much Hate with #china? A Preliminary Analysis on China-related Hateful Tweets Two Years After the Covid Pandemic Began
Jinghua Xu
Zarah Weiß
40
1
0
11 Nov 2022
NaturalAdversaries: Can Naturalistic Adversaries Be as Effective as Artificial Adversaries?
Saadia Gabriel
Hamid Palangi
Yejin Choi
AAML
45
1
0
08 Nov 2022
Human-Machine Collaboration Approaches to Build a Dialogue Dataset for Hate Speech Countering
Helena Bonaldi
Sara Dellantonio
Serra Sinem Tekiroğlu
Marco Guerini
29
42
0
07 Nov 2022
Why Is It Hate Speech? Masked Rationale Prediction for Explainable Hate Speech Detection
Jiyun Kim
Byounghan Lee
Kyung-ah Sohn
29
13
0
01 Nov 2022
XMD: An End-to-End Framework for Interactive Explanation-Based Debugging of NLP Models
Dong-Ho Lee
Akshen Kadakia
Brihi Joshi
Aaron Chan
Ziyi Liu
...
Takashi Shibuya
Ryosuke Mitani
Toshiyuki Sekiya
Jay Pujara
Xiang Ren
LRM
40
9
0
30 Oct 2022
System Demo: Tool and Infrastructure for Offensive Language Error Analysis (OLEA) in English
M. Grace
Xajavion Jay Seabrum
Dananjay Srinivas
Alexis Palmer
45
0
0
28 Oct 2022
ExPUNations: Augmenting Puns with Keywords and Explanations
Jiao Sun
Anjali Narayan-Chen
Shereen Oraby
Alessandra Cervone
Tagyoung Chung
Jing Huang
Yang Liu
Nanyun Peng
19
10
0
24 Oct 2022
Different Tunes Played with Equal Skill: Exploring a Unified Optimization Subspace for Delta Tuning
Jing Yi
Weize Chen
Yujia Qin
Yankai Lin
Ning Ding
Xu Han
Zhiyuan Liu
Maosong Sun
Jie Zhou
20
2
0
24 Oct 2022
On the Transformation of Latent Space in Fine-Tuned NLP Models
Nadir Durrani
Hassan Sajjad
Fahim Dalvi
Firoj Alam
32
18
0
23 Oct 2022
How Hate Speech Varies by Target Identity: A Computational Analysis
Michael Miller Yoder
Lynnette Hui Xian Ng
D. W. Brown
Kathleen M. Carley
33
20
0
19 Oct 2022
NormSAGE: Multi-Lingual Multi-Cultural Norm Discovery from Conversations On-the-Fly
Yi R. Fung
Tuhin Chakraborty
Hao Guo
Owen Rambow
Smaranda Muresan
Heng Ji
21
39
0
16 Oct 2022
T5 for Hate Speech, Augmented Data and Ensemble
Tosin P. Adewumi
Sana Sabah Sabry
Nosheen Abid
F. Liwicki
Marcus Liwicki
11
10
0
11 Oct 2022
Empowering the Fact-checkers! Automatic Identification of Claim Spans on Twitter
Megha Sundriyal
Atharva Kulkarni
Vaibhav Pulastya
Md. Shad Akhtar
Tanmoy Chakraborty
MedIm
28
18
0
10 Oct 2022
A Survey of Methods for Addressing Class Imbalance in Deep-Learning Based Natural Language Processing
Sophie Henning
William H. Beluch
Alexander Fraser
Annemarie Friedrich
22
20
0
10 Oct 2022
Quantitative Metrics for Evaluating Explanations of Video DeepFake Detectors
Federico Baldassarre
Quentin Debard
Gonzalo Fiz Pontiveros
Tri Kurniawan Wijaya
44
4
0
07 Oct 2022