2012.10289
HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection
18 December 2020
Binny Mathew
Punyajoy Saha
Seid Muhie Yimam
Chris Biemann
Pawan Goyal
Animesh Mukherjee
Papers citing
"HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection"
50 / 280 papers shown
Title
Hate Speech and Offensive Language Detection in Bengali
Mithun Das
Somnath Banerjee
Punyajoy Saha
Animesh Mukherjee
28
27
0
07 Oct 2022
Debiasing isn't enough! -- On the Effectiveness of Debiasing MLMs and their Social Biases in Downstream Tasks
Masahiro Kaneko
Danushka Bollegala
Naoaki Okazaki
28
41
0
06 Oct 2022
Time Will Change Things: An Empirical Study on Dynamic Language Understanding in Social Media Classification
Yuji Zhang
Jing Li
43
5
0
06 Oct 2022
Quantifying How Hateful Communities Radicalize Online Users
Matheus Schmitz
Keith Burghardt
Goran Murić
9
12
0
19 Sep 2022
"Dummy Grandpa, do you know anything?": Identifying and Characterizing Ad hominem Fallacy Usage in the Wild
Utkarsh P. Patel
Animesh Mukherjee
Mainack Mondal
14
2
0
05 Sep 2022
Explainable Artificial Intelligence Applications in Cyber Security: State-of-the-Art in Research
Zhibo Zhang
H. A. Hamadi
Ernesto Damiani
C. Yeun
Fatma Taher
AAML
32
148
0
31 Aug 2022
Combating high variance in Data-Scarce Implicit Hate Speech Classification
Debaditya Pal
Kaustubh Chaudhari
Harsh Sharma
25
2
0
29 Aug 2022
Exploring Hate Speech Detection with HateXplain and BERT
Arvind Subramaniam
A. Mehra
Sayani Kundu
18
3
0
09 Aug 2022
ferret: a Framework for Benchmarking Explainers on Transformers
Giuseppe Attanasio
Eliana Pastor
C. Bonaventura
Debora Nozza
33
30
0
02 Aug 2022
Few-shot Adaptation Works with UnpredicTable Data
Jun Shern Chan
Michael Pieler
Jonathan Jao
Jérémy Scheurer
Ethan Perez
36
5
0
01 Aug 2022
An Interpretability Evaluation Benchmark for Pre-trained Language Models
Ya-Ming Shen
Lijie Wang
Ying-Cong Chen
Xinyan Xiao
Jing Liu
Hua Wu
37
4
0
28 Jul 2022
An Additive Instance-Wise Approach to Multi-class Model Interpretation
Vy Vo
Van Nguyen
Trung Le
Quan Hung Tran
Gholamreza Haffari
S. Çamtepe
Dinh Q. Phung
FAtt
48
5
0
07 Jul 2022
Hate Speech Criteria: A Modular Approach to Task-Specific Hate Speech Definitions
Urja Khurana
I. Vermeulen
Eric T. Nalisnick
M. V. Noorloos
Antske Fokkens
AILaw
20
17
0
30 Jun 2022
Which one is more toxic? Findings from Jigsaw Rate Severity of Toxic Comments
M. Das
Punyajoy Saha
Mithun Das
21
8
0
27 Jun 2022
Explainable and High-Performance Hate and Offensive Speech Detection
M. Babaeianjelodar
Gurram Poorna Prudhvi
Stephen Lorenz
Keyu Chen
Sumona Mondal
Soumyabrata Dey
Navin Kumar
11
2
0
26 Jun 2022
KnowDA: All-in-One Knowledge Mixture Model for Data Augmentation in Low-Resource NLP
Yufei Wang
Jiayi Zheng
Can Xu
Xiubo Geng
Tao Shen
Chongyang Tao
Daxin Jiang
VLM
MoE
31
2
0
21 Jun 2022
Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models
Paul Röttger
Haitham Seelawi
Debora Nozza
Zeerak Talat
Bertie Vidgen
30
65
0
20 Jun 2022
Codec at SemEval-2022 Task 5: Multi-Modal Multi-Transformer Misogynous Meme Classification Framework
Ahmed M. Mahran
C. Borella
K. Perifanos
14
1
0
14 Jun 2022
Hate Speech and Counter Speech Detection: Conversational Context Does Matter
Xinchen Yu
Eduardo Blanco
Lingzi Hong
14
42
0
13 Jun 2022
Proactively Reducing the Hate Intensity of Online Posts via Hate Speech Normalization
Sarah Masud
Manjot Bedi
Mohammad Aflah Khan
Md. Shad Akhtar
Tanmoy Chakraborty
31
21
0
08 Jun 2022
Challenges in Applying Explainability Methods to Improve the Fairness of NLP Models
Esma Balkir
S. Kiritchenko
I. Nejadgholi
Kathleen C. Fraser
21
36
0
08 Jun 2022
Speech Detection Task Against Asian Hate: BERT the Central, While Data-Centric Studies the Crucial
Xin Lian
19
1
0
05 Jun 2022
Vietnamese Hate and Offensive Detection using PhoBERT-CNN and Social Media Streaming Data
K. Tran
An Trong Nguyen
Phu Gia Hoang
Canh Duc Luu
Trong-Hop Do
Kiet Van Nguyen
22
23
0
01 Jun 2022
Hollywood Identity Bias Dataset: A Context Oriented Bias Analysis of Movie Dialogues
Sandhya Singh
Prapti Roy
Nihar Ranjan Sahoo
Niteesh Mallela
Himanshu Gupta
...
Milind Savagaonkar
Nidhi
Roshni Ramnani
Anutosh Maitra
Shubhashis Sengupta
25
13
0
31 May 2022
Eliciting and Understanding Cross-Task Skills with Task-Level Mixture-of-Experts
Qinyuan Ye
Juan Zha
Xiang Ren
MoE
18
12
0
25 May 2022
ToKen: Task Decomposition and Knowledge Infusion for Few-Shot Hate Speech Detection
Badr AlKhamissi
Faisal Ladhak
Srini Iyer
Ves Stoyanov
Zornitsa Kozareva
Xian Li
Pascale Fung
Lambert Mathias
Asli Celikyilmaz
Mona T. Diab
48
17
0
25 May 2022
Toxicity Detection with Generative Prompt-based Inference
Yau-Shian Wang
Y. Chang
93
36
0
24 May 2022
KOLD: Korean Offensive Language Dataset
Young-kuk Jeong
Juhyun Oh
Jaimeen Ahn
Jongwon Lee
Jihyung Mon
Sungjoon Park
Alice Oh
57
25
0
23 May 2022
A Fine-grained Interpretability Evaluation Benchmark for Neural NLP
Lijie Wang
Yaozong Shen
Shu-ping Peng
Shuai Zhang
Xinyan Xiao
Hao Liu
Hongxuan Tang
Ying-Cong Chen
Hua Wu
Haifeng Wang
ELM
19
21
0
23 May 2022
Analyzing Hate Speech Data along Racial, Gender and Intersectional Axes
Antonis Maronikolakis
Philip Baader
Hinrich Schütze
28
9
0
13 May 2022
CounterGeDi: A controllable approach to generate polite, detoxified and emotional counterspeech
Punyajoy Saha
Kanishk Singh
Adarsh Kumar
Binny Mathew
Animesh Mukherjee
16
36
0
09 May 2022
Necessity and Sufficiency for Explaining Text Classifiers: A Case Study in Hate Speech Detection
Esma Balkir
I. Nejadgholi
Kathleen C. Fraser
S. Kiritchenko
FAtt
41
27
0
06 May 2022
GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers
Ali Modarressi
Mohsen Fayyaz
Yadollah Yaghoobzadeh
Mohammad Taher Pilehvar
ViT
27
33
0
06 May 2022
HateCheckHIn: Evaluating Hindi Hate Speech Detection Models
Mithun Das
Punyajoy Saha
Binny Mathew
Animesh Mukherjee
36
17
0
30 Apr 2022
CAVES: A Dataset to facilitate Explainable Classification and Summarization of Concerns towards COVID Vaccines
Soham Poddar
Azlaan Mustafa Samad
Rajdeep Mukherjee
Niloy Ganguly
Saptarshi Ghosh
25
26
0
28 Apr 2022
Data Bootstrapping Approaches to Improve Low Resource Abusive Language Detection for Indic Languages
Mithun Das
Somnath Banerjee
Animesh Mukherjee
11
42
0
26 Apr 2022
A survey on improving NLP models with human explanations
Mareike Hartmann
Daniel Sonntag
LRM
40
21
0
19 Apr 2022
CRUSH: Contextually Regularized and User anchored Self-supervised Hate speech Detection
Souvic Chakraborty
Parag Dutta
Sumegh Roychowdhury
Animesh Mukherjee
6
6
0
13 Apr 2022
Using Pre-Trained Language Models for Producing Counter Narratives Against Hate Speech: a Comparative Study
Serra Sinem Tekiroğlu
Helena Bonaldi
Margherita Fanton
Marco Guerini
24
43
0
04 Apr 2022
On the Intrinsic and Extrinsic Fairness Evaluation Metrics for Contextualized Language Representations
Yang Trista Cao
Yada Pruksachatkun
Kai-Wei Chang
Rahul Gupta
Varun Kumar
Jwala Dhamala
Aram Galstyan
16
92
0
25 Mar 2022
Listening to Affected Communities to Define Extreme Speech: Dataset and Experiments
Antonis Maronikolakis
Axel Wisiorek
Leah Nann
Haris Jabbar
Sahana Udupa
Hinrich Schütze
24
24
0
22 Mar 2022
Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models
Ning Ding
Yujia Qin
Guang Yang
Fu Wei
Zonghan Yang
...
Jianfei Chen
Yang Liu
Jie Tang
Juan Li
Maosong Sun
34
197
0
14 Mar 2022
Large-Scale Hate Speech Detection with Cross-Domain Transfer
Cagri Toraman
Furkan Şahinuç
E. Yilmaz
32
60
0
02 Mar 2022
Deep Learning for Hate Speech Detection: A Comparative Study
Jitendra Malik
Hezhe Qiao
Guansong Pang
Anton Van Den Hengel
51
44
0
19 Feb 2022
ADIMA: Abuse Detection In Multilingual Audio
Vikram Gupta
Rini A. Sharon
Ramit Sawhney
Debdoot Mukherjee
21
20
0
16 Feb 2022
DermX: an end-to-end framework for explainable automated dermatological diagnosis
Raluca Jalaboi
F. Faye
Mauricio Orbes-Arteaga
D. Jørgensen
Ole Winther
A. Galimzianova
MedIm
19
17
0
14 Feb 2022
HaT5: Hate Language Identification using Text-to-Text Transfer Transformer
Sana Sabah Sabry
Tosin P. Adewumi
Nosheen Abid
Gyorgy Kovács
F. Liwicki
Marcus Liwicki
34
13
0
11 Feb 2022
Hateful Memes Challenge: An Enhanced Multimodal Framework
Aijing Gao
Bingjun Wang
Jiaqi Yin
Yating Tian
21
2
0
20 Dec 2021
Leveraging Transformers for Hate Speech Detection in Conversational Code-Mixed Tweets
Zaki Mustafa Farooqi
Sreyan Ghosh
R. Shah
32
29
0
18 Dec 2021
Reducing Target Group Bias in Hate Speech Detectors
Darsh J. Shah
Sinong Wang
Han Fang
Hao Ma
Luke Zettlemoyer
FaML
31
2
0
07 Dec 2021