Event-Arguments Extraction Corpus and Modeling using BERT for Arabic

Event-argument extraction is a challenging task, particularly in Arabic, where linguistic resources are sparse. To fill this gap, we introduce the \hadath corpus (k tokens) as an extension of Wojood, enriched with event-argument annotations. We used three types of event arguments: , , and , which we annotated as relation types. Our inter-annotator agreement evaluation resulted in score and -score. Additionally, we propose a novel method for event relation extraction using BERT, in which we treat the task as textual entailment. This method achieves an -score of . To further evaluate the generalization of our proposed method, we collected and annotated another out-of-domain corpus (about k tokens) called \testNLI and used it as a second test set, on which our approach achieved promising results ( -score). Finally, we propose an end-to-end system for event-argument extraction. This system is implemented as part of SinaTools, and both corpora are publicly available at https://sina.birzeit.edu/wojood.
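To illustrate what treating event relation extraction as entailment can look like, the following minimal Python sketch builds premise/hypothesis pairs for every event and candidate entity in a sentence. It is an assumption-laden illustration, not the authors' implementation: the role labels (`ROLE_A`, `ROLE_B`, `ROLE_C`), the hypothesis templates, and the names `Mention` and `build_entailment_pairs` are all hypothetical. In a full pipeline, each pair would presumably be scored by a BERT sequence-pair classifier, and pairs judged as entailed would yield the predicted relations.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Mention:
    """A span already tagged in the sentence (hypothetical structure)."""
    text: str
    kind: str  # "event" or "entity"


# Hypothetical role labels and hypothesis templates; the actual argument
# types and wording used by the authors are not given in the abstract.
TEMPLATES = {
    "ROLE_A": "{entity} is the ROLE_A of {event}",
    "ROLE_B": "{entity} is the ROLE_B of {event}",
    "ROLE_C": "{entity} is the ROLE_C of {event}",
}


def build_entailment_pairs(sentence, mentions):
    """Return one premise/hypothesis pair per (event, entity, role) triple."""
    events = [m for m in mentions if m.kind == "event"]
    entities = [m for m in mentions if m.kind == "entity"]
    pairs = []
    for event, entity in product(events, entities):
        for role, template in TEMPLATES.items():
            pairs.append({
                "premise": sentence,  # the original sentence is the premise
                "hypothesis": template.format(entity=entity.text, event=event.text),
                "event": event.text,
                "entity": entity.text,
                "role": role,
            })
    return pairs


sentence = "The conference was held in Ramallah in 2023."
mentions = [
    Mention("conference", "event"),
    Mention("Ramallah", "entity"),
    Mention("2023", "entity"),
]
pairs = build_entailment_pairs(sentence, mentions)
```

Framing the task this way turns a multi-class relation decision into a set of binary entailment decisions, so any sentence-pair classifier can be reused without a task-specific output layer.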