Generative Data Imputation for Sparse Learner Performance Data Using Generative Adversarial Imputation Networks

23 March 2025
Liang Zhang
Jionghao Lin
John Sabatini
Diego Zapata-Rivera
Carol Forsyth
Yang Jiang
John Hollander
Xiangen Hu
Arthur C. Graesser
Abstract

Learner performance data collected by Intelligent Tutoring Systems (ITSs), such as responses to questions, is essential for modeling and predicting learners' knowledge states. However, missing responses due to skips or incomplete attempts create data sparsity, challenging accurate assessment and personalized instruction. To address this, we propose a generative imputation approach using Generative Adversarial Imputation Networks (GAIN). Our method features a three-dimensional (3D) framework (learners, questions, and attempts), flexibly accommodating various sparsity levels. Enhanced by convolutional neural networks and optimized with a least squares loss function, the GAIN-based method aligns input and output dimensions to question-attempt matrices along the learners' dimension. Extensive experiments using datasets from AutoTutor Adult Reading Comprehension (ARC), ASSISTments, and MATHia demonstrate that our approach significantly outperforms tensor factorization and alternative GAN methods in imputation accuracy across different attempt scenarios. Bayesian Knowledge Tracing (BKT) further validates the effectiveness of the imputed data by estimating learning parameters: initial knowledge (P(L0)), learning rate (P(T)), guess rate (P(G)), and slip rate (P(S)). Results indicate the imputed data enhances model fit and closely mirrors original distributions, capturing underlying learning behaviors reliably. Kullback-Leibler (KL) divergence assessments confirm minimal divergence, showing the imputed data preserves essential learning characteristics effectively. These findings underscore GAIN's capability as a robust imputation tool in ITSs, alleviating data sparsity and supporting adaptive, individualized instruction, ultimately leading to more precise and responsive learner assessments and improved educational outcomes.
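The four BKT parameters named in the abstract, and the KL divergence check, can be illustrated with a minimal sketch. It uses the standard BKT update equations and the discrete KL divergence formula; it is not the authors' implementation. The function names `bkt_trace` and `kl_divergence`, the example parameter values, and the assumption that distributions are supplied as already-normalized histograms are all hypothetical choices made for illustration.

```python
import math


def bkt_trace(responses, p_l0, p_t, p_g, p_s):
    """Run standard BKT over a sequence of 0/1 responses (hypothetical helper).

    Assumes all parameters lie strictly between 0 and 1.
    Returns (predicted P(correct) before each attempt, final mastery estimate).
    """
    p_l = p_l0  # current probability that the skill is learned
    predictions = []
    for r in responses:
        # Predicted probability of a correct answer before seeing the response.
        predictions.append(p_l * (1 - p_s) + (1 - p_l) * p_g)
        if r == 1:
            # Correct: either learned and no slip, or unlearned and a guess.
            num = p_l * (1 - p_s)
            den = num + (1 - p_l) * p_g
        else:
            # Incorrect: either learned and a slip, or unlearned and no guess.
            num = p_l * p_s
            den = num + (1 - p_l) * (1 - p_g)
        posterior = num / den
        # Learning transition: an unlearned skill becomes learned with P(T).
        p_l = posterior + (1 - posterior) * p_t
    return predictions, p_l


def kl_divergence(p, q, eps=1e-12):
    """Discrete KL(p || q) for two normalized histograms of equal length.

    eps is an illustrative smoothing constant to avoid division by zero.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


if __name__ == "__main__":
    # Illustrative parameter values, not taken from the paper.
    preds, mastery = bkt_trace([1, 0, 1, 1], p_l0=0.3, p_t=0.1, p_g=0.2, p_s=0.1)
    print(preds, mastery)
    print(kl_divergence([0.5, 0.5], [0.9, 0.1]))
```

In this reading, comparing parameters fitted on the original data with those fitted on the imputed data would amount to calling `kl_divergence` on the two histograms of each parameter; the abstract does not state how those distributions are constructed, so the binning is left to the caller.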

@article{zhang2025_2503.18982,
  title={Generative Data Imputation for Sparse Learner Performance Data Using Generative Adversarial Imputation Networks},
  author={Liang Zhang and Jionghao Lin and John Sabatini and Diego Zapata-Rivera and Carol Forsyth and Yang Jiang and John Hollander and Xiangen Hu and Arthur C. Graesser},
  journal={arXiv preprint arXiv:2503.18982},
  year={2025}
}