Detecting Hateful Memes Using a Multimodal Deep Ensemble

24 December 2020

Abstract

While significant progress has been made using machine learning algorithms to detect hate speech, important technical challenges remain before their performance approaches human accuracy. We investigate several of the most recent visual-linguistic Transformer architectures and propose improvements to increase their performance on this task. The proposed model outperforms the baselines by a large margin and ranks 5th on the leaderboard out of 3,100+ participants.

