Detecting Hateful Memes Using a Multimodal Deep Ensemble

Abstract

While significant progress has been made using machine learning algorithms to detect hate speech, important technical challenges remain before their performance approaches human accuracy. We investigate several of the most recent visual-linguistic Transformer architectures and propose improvements to increase their performance on this task. The proposed model outperforms the baselines by a large margin and ranks 5th on the leaderboard out of 3,100+ participants.
