BD at BEA 2025 Shared Task: MPNet Ensembles for Pedagogical Mistake Identification and Localization in AI Tutor Responses

2 June 2025

Main:9 Pages

8 Figures

Bibliography:2 Pages

5 Tables

Appendix:2 Pages

Abstract

We present Team BD's submission to the BEA 2025 Shared Task on Pedagogical Ability Assessment of AI-powered Tutors, under Track 1 (Mistake Identification) and Track 2 (Mistake Location). Both tracks involve three-class classification of tutor responses in educational dialogues - determining if a tutor correctly recognizes a student's mistake (Track 1) and whether the tutor pinpoints the mistake's location (Track 2). Our system is built on MPNet, a Transformer-based language model that combines BERT and XLNet's pre-training advantages. We fine-tuned MPNet on the task data using a class-weighted cross-entropy loss to handle class imbalance, and leveraged grouped cross-validation (10 folds) to maximize the use of limited data while avoiding dialogue overlap between training and validation. We then performed a hard-voting ensemble of the best models from each fold, which improves robustness and generalization by combining multiple classifiers. Our approach achieved strong results on both tracks, with exact-match macro-F1 scores of approximately 0.7110 for Mistake Identification and 0.5543 for Mistake Location on the official test set. We include comprehensive analysis of our system's performance, including confusion matrices and t-SNE visualizations to interpret classifier behavior, as well as a taxonomy of common errors with examples. We hope our ensemble-based approach and findings provide useful insights for designing reliable tutor response evaluation systems in educational dialogue settings.

View on arXiv

@article{rohan2025_2506.01817,
  title={ BD at BEA 2025 Shared Task: MPNet Ensembles for Pedagogical Mistake Identification and Localization in AI Tutor Responses },
  author={ Shadman Rohan and Ishita Sur Apan and Muhtasim Ibteda Shochcho and Md Fahim and Mohammad Ashfaq Ur Rahman and AKM Mahbubur Rahman and Amin Ahsan Ali },
  journal={arXiv preprint arXiv:2506.01817},
  year={ 2025 }
}

Comments on this paper