Enhancing Arabic Automated Essay Scoring with Synthetic Data and Error Injection

22 March 2025
Chatrine Qwaider
Bashar Alhafni
Kirill Chirkunov
Nizar Habash
Ted Briscoe
Abstract

Automated Essay Scoring (AES) plays a crucial role in assessing language learners' writing quality, reducing grading workload, and providing real-time feedback. Arabic AES systems are particularly challenged by the lack of annotated essay datasets. This paper presents a novel framework leveraging Large Language Models (LLMs) and Transformers to generate synthetic Arabic essay datasets for AES. We prompt an LLM to generate essays across CEFR proficiency levels and introduce controlled error injection using a fine-tuned Standard Arabic BERT model for error type prediction. Our approach produces realistic human-like essays, contributing a dataset of 3,040 annotated essays. Additionally, we develop a BERT-based auto-marking system for accurate and scalable Arabic essay evaluation. Experimental results demonstrate the effectiveness of our framework in improving Arabic AES performance.
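
The controlled error injection step described in the abstract can be illustrated with a small toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the per-level error rates, the `SWAP`/`DELETE`/`KEEP` labels, and the rule-based `predict_error_type` stub, which merely stands in for the fine-tuned BERT error-type predictor. The idea shown is only the general shape: pick tokens at a rate tied to the target CEFR level, ask a predictor which error type fits each token, and apply the matching corruption.

```python
import random

# Hypothetical per-level error rates (not from the paper):
# lower CEFR levels receive more injected errors.
ERROR_RATE = {"A1": 0.30, "A2": 0.25, "B1": 0.15, "B2": 0.10, "C1": 0.05, "C2": 0.02}


def predict_error_type(token):
    """Rule-based stand-in for a fine-tuned error-type classifier (hypothetical)."""
    if len(token) > 3:
        return "SWAP"    # transpose the second and third characters
    if len(token) > 1:
        return "DELETE"  # drop the final character
    return "KEEP"        # single-character tokens are left alone


def corrupt(token, error_type):
    """Apply the corruption that corresponds to the predicted error type."""
    if error_type == "SWAP":
        return token[0] + token[2] + token[1] + token[3:]
    if error_type == "DELETE":
        return token[:-1]
    return token


def inject_errors(essay, level, seed=0):
    """Corrupt a level-dependent fraction of whitespace-separated tokens."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    rate = ERROR_RATE[level]
    out = []
    for token in essay.split():
        if rng.random() < rate:
            out.append(corrupt(token, predict_error_type(token)))
        else:
            out.append(token)
    return " ".join(out)


if __name__ == "__main__":
    clean = "ذهب الطالب إلى المدرسة صباحا"
    print(inject_errors(clean, "A1"))
    print(inject_errors(clean, "C2"))
```

In a real pipeline the stub would presumably be replaced by a model that predicts error types in context, and the corruptions would cover linguistically motivated Arabic error categories rather than character swaps and deletions.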

@article{qwaider2025_2503.17739,
  title={Enhancing Arabic Automated Essay Scoring with Synthetic Data and Error Injection},
  author={Chatrine Qwaider and Bashar Alhafni and Kirill Chirkunov and Nizar Habash and Ted Briscoe},
  journal={arXiv preprint arXiv:2503.17739},
  year={2025}
}