Evaluating LLM-Generated Q&A Test: a Student-Centered Study

10 May 2025
Anna Wróblewska
Bartosz Grabek
Jakub Świstak
Daniel Dan
Abstract

This research presents an automated pipeline for generating reliable question-answer (Q&A) tests with AI chatbots. Using GPT-4o-mini, we automatically generated a Q&A test for a Natural Language Processing course and evaluated its psychometric properties and perceived quality with students and experts. A mixed-format item response theory (IRT) analysis showed that the generated items exhibit strong discrimination and appropriate difficulty, while student and expert star ratings reflect high overall quality. A uniform differential item functioning (DIF) check flagged two items for review. These findings demonstrate that LLM-generated assessments can match human-authored tests in psychometric performance and user satisfaction, illustrating a scalable approach to AI-assisted assessment development.
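
For illustration only (not the authors' released pipeline), the sketch below shows how such Q&A items could be generated with GPT-4o-mini through the OpenAI Python SDK; the prompt wording, item count, and JSON schema are assumptions made for this example.

    import json
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Hypothetical prompt: the paper's actual prompt and item format are not shown here.
    PROMPT = (
        "Write 5 multiple-choice questions about word embeddings for a "
        "graduate Natural Language Processing course. Return a JSON object "
        "with a list 'items'; each item has 'question', 'options' "
        "(4 strings), and 'answer' (index of the correct option)."
    )

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": PROMPT}],
        response_format={"type": "json_object"},  # request parseable JSON
        temperature=0.3,
    )

    items = json.loads(response.choices[0].message.content)["items"]
    for item in items:
        print(item["question"], "->", item["options"][item["answer"]])

The generated items would then be administered to students and their scored responses fitted with an IRT model. The abstract does not specify the exact parameterization, but in the standard two-parameter logistic (2PL) form the probability of a correct response is 1 / (1 + exp(-a(θ - b))), where a is the item's discrimination and b its difficulty, the two properties the abstract reports as strong and appropriate.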

@article{wróblewska2025_2505.06591,
  title={Evaluating LLM-Generated Q&A Test: a Student-Centered Study},
  author={Anna Wróblewska and Bartosz Grabek and Jakub Świstak and Daniel Dan},
  journal={arXiv preprint arXiv:2505.06591},
  year={2025}
}