
arXiv:2507.12674
ParaStudent: Generating and Evaluating Realistic Student Code by Teaching LLMs to Struggle

16 July 2025
Mihran Miroyan
Rose Niousha
Joseph E. Gonzalez
Gireeja Ranade
Narges Norouzi
    AI4Ed
Main: 8 pages · 12 figures · Bibliography: 4 pages · 6 tables · Appendix: 6 pages
Abstract

Large Language Models (LLMs) have shown strong performance on programming tasks, but can they generate code the way real students do: imperfect, iterative, and stylistically diverse? We present ParaStudent, a systematic study of LLM-based "student-like" code generation in an introductory programming course setting. Using a dataset of timestamped student submissions across multiple semesters, we design low- and high-resolution experiments to model student progress and evaluate code outputs along semantic, functional, and stylistic dimensions. Our results show that fine-tuning significantly improves alignment with real student trajectories and more faithfully captures error patterns, incremental improvements, and stylistic variations. This study shows that modeling realistic student code requires capturing learning dynamics through context-aware generation, temporal modeling, and multi-dimensional evaluation. Code for experiments and evaluation is available at this https URL.
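The abstract's three evaluation axes can be sketched with simple proxies. The metrics below (test pass rate for functional, a difflib similarity ratio for semantic, a line-length heuristic for stylistic) are illustrative assumptions for this sketch, not the authors' actual evaluation pipeline:

```python
import difflib

def functional_score(code: str, func_name: str, tests: list) -> float:
    """Fraction of unit tests a submission passes (0.0 if it fails to run)."""
    ns = {}
    try:
        exec(code, ns)
        fn = ns[func_name]
    except Exception:
        return 0.0
    passed = 0
    for args, expected in tests:
        try:
            if fn(*args) == expected:
                passed += 1
        except Exception:
            pass  # runtime errors count as failures
    return passed / len(tests)

def semantic_similarity(generated: str, reference: str) -> float:
    """Surface-level similarity between two programs (difflib ratio in [0, 1])."""
    return difflib.SequenceMatcher(None, generated, reference).ratio()

def style_score(code: str, max_len: int = 79) -> float:
    """Crude style proxy: fraction of non-empty lines within max_len characters."""
    lines = [ln for ln in code.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    return sum(len(ln) <= max_len for ln in lines) / len(lines)

# A correct reference and an imperfect, "student-like" attempt (hypothetical data).
reference = "def add(a, b):\n    return a + b\n"
student = "def add(a, b):\n    return a - b\n"  # typical sign error

tests = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
print(functional_score(student, "add", tests))           # passes only the (0, 0) case
print(round(semantic_similarity(student, reference), 2))  # near 1.0: one character differs
print(style_score(student))
```

Scoring along separate axes like this lets a generated submission be, say, stylistically plausible yet functionally wrong, which is exactly the kind of imperfect, student-like behavior the paper aims to capture.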
