2012.15761
Learning from the Worst: Dynamically Generated Datasets to Improve Online Hate Detection
31 December 2020
Bertie Vidgen
Tristan Thrush
Zeerak Talat
Douwe Kiela
Papers citing
"Learning from the Worst: Dynamically Generated Datasets to Improve Online Hate Detection"
26 / 76 papers shown
Title
Assessing Language Model Deployment with Risk Cards
Leon Derczynski
Hannah Rose Kirk
Vidhisha Balachandran
Sachin Kumar
Yulia Tsvetkov
M. Leiser
Saif Mohammad
28
42
0
31 Mar 2023
SemEval-2023 Task 10: Explainable Detection of Online Sexism
Hannah Rose Kirk
Wenjie Yin
Bertie Vidgen
Paul Röttger
21
117
0
07 Mar 2023
A Federated Approach for Hate Speech Detection
Jay Gala
Deep Gandhi
Jash Mehta
Zeerak Talat
21
4
0
18 Feb 2023
Cross-Reality Re-Rendering: Manipulating between Digital and Physical Realities
Siddhartha Datta
33
0
0
15 Nov 2022
NaturalAdversaries: Can Naturalistic Adversaries Be as Effective as Artificial Adversaries?
Saadia Gabriel
Hamid Palangi
Yejin Choi
AAML
42
1
0
08 Nov 2022
Human-Machine Collaboration Approaches to Build a Dialogue Dataset for Hate Speech Countering
Helena Bonaldi
Sara Dellantonio
Serra Sinem Tekiroğlu
Marco Guerini
29
41
0
07 Nov 2022
BotsTalk: Machine-sourced Framework for Automatic Curation of Large-scale Multi-skill Dialogue Datasets
Minju Kim
Chaehyeong Kim
Yongho Song
Seung-won Hwang
Jinyoung Yeo
39
13
0
23 Oct 2022
Data-Efficient Strategies for Expanding Hate Speech Detection into Under-Resourced Languages
Paul Röttger
Debora Nozza
Federico Bianchi
Dirk Hovy
29
10
0
20 Oct 2022
The State of Profanity Obfuscation in Natural Language Processing
Debora Nozza
Dirk Hovy
42
7
0
14 Oct 2022
Explainable Abuse Detection as Intent Classification and Slot Filling
Agostina Calabrese
Bjorn Ross
Mirella Lapata
36
10
0
06 Oct 2022
Domain Classification-based Source-specific Term Penalization for Domain Adaptation in Hate-speech Detection
Tulika Bose
Nikolaos Aletras
Irina Illina
Dominique Fohr
19
0
0
18 Sep 2022
Increasing Adverse Drug Events extraction robustness on social media: case study on negation and speculation
Simone Scaboro
Beatrice Portelli
Emmanuele Chersoni
Enrico Santus
G. Serra
24
5
0
06 Sep 2022
Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models
Paul Röttger
Haitham Seelawi
Debora Nozza
Zeerak Talat
Bertie Vidgen
30
65
0
20 Jun 2022
Adversarial Text Normalization
Joanna Bitton
Maya Pavlova
Ivan Evtimov
AAML
22
2
0
08 Jun 2022
Counterfactually Augmented Data and Unintended Bias: The Case of Sexism and Hate Speech Detection
Indira Sen
Mattia Samory
Claudia Wagner
Isabelle Augenstein
26
17
0
09 May 2022
Using Pre-Trained Language Models for Producing Counter Narratives Against Hate Speech: a Comparative Study
Serra Sinem Tekiroğlu
Helena Bonaldi
Margherita Fanton
Marco Guerini
24
43
0
04 Apr 2022
Dynamically Refined Regularization for Improving Cross-corpora Hate Speech Detection
Tulika Bose
Nikolaos Aletras
Irina Illina
Dominique Fohr
45
5
0
23 Mar 2022
ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection
Thomas Hartvigsen
Saadia Gabriel
Hamid Palangi
Maarten Sap
Dipankar Ray
Ece Kamar
33
350
0
17 Mar 2022
Reducing Target Group Bias in Hate Speech Detectors
Darsh J. Shah
Sinong Wang
Han Fang
Hao Ma
Luke Zettlemoyer
FaML
25
2
0
07 Dec 2021
Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection
Maarten Sap
Swabha Swayamdipta
Laura Vianna
Xuhui Zhou
Yejin Choi
Noah A. Smith
46
267
0
15 Nov 2021
BBQ: A Hand-Built Bias Benchmark for Question Answering
Alicia Parrish
Angelica Chen
Nikita Nangia
Vishakh Padmakumar
Jason Phang
Jana Thompson
Phu Mon Htut
Sam Bowman
223
374
0
15 Oct 2021
Anticipating Safety Issues in E2E Conversational AI: Framework and Tooling
Emily Dinan
Gavin Abercrombie
A. S. Bergman
Shannon L. Spruit
Dirk Hovy
Y-Lan Boureau
Verena Rieser
43
105
0
07 Jul 2021
Improving Question Answering Model Robustness with Synthetic Adversarial Data Generation
Max Bartolo
Tristan Thrush
Robin Jia
Sebastian Riedel
Pontus Stenetorp
Douwe Kiela
AAML
22
103
0
18 Apr 2021
HateCheck: Functional Tests for Hate Speech Detection Models
Paul Röttger
B. Vidgen
Dong Nguyen
Zeerak Talat
Helen Z. Margetts
J. Pierrehumbert
31
259
0
31 Dec 2020
A Framework for the Computational Linguistic Analysis of Dehumanization
Julia Mendelsohn
Yulia Tsvetkov
Dan Jurafsky
87
89
0
06 Mar 2020
Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets
Mor Geva
Yoav Goldberg
Jonathan Berant
242
320
0
21 Aug 2019