We introduce SCRum-9, a multilingual dataset for Rumour Stance Classification, containing 7,516 tweet-reply pairs from X. SCRum-9 goes beyond existing stance classification datasets by covering more languages (9), linking examples to more fact-checked claims (2.1k), and including complex annotations from multiple annotators to account for intra- and inter-annotator variability. Annotations were made by at least three native speakers per language, totalling around 405 hours of annotation and 8,150 dollars in compensation. Experiments on SCRum-9 show that it is a challenging benchmark for both state-of-the-art LLMs (e.g. DeepSeek) and fine-tuned pre-trained models, motivating future work in this area.
@article{li2025_2505.18916,
  title={SCRum-9: Multilingual Stance Classification over Rumours on Social Media},
  author={Yue Li and Jake Vasilakes and Zhixue Zhao and Carolina Scarton},
  journal={arXiv preprint arXiv:2505.18916},
  year={2025}
}