LECTOR: Summarizing E-book Reading Content for Personalized Student Support

Educational e-book platforms provide valuable information to teachers and researchers through two main sources: reading activity data and reading content data. While reading activity data is commonly used to analyze learning strategies and predict low-performing students, reading content data is often overlooked in these analyses. To address this gap, this study proposes LECTOR (Lecture slides and Topic Relationships), a model that summarizes information from reading content in a format that can be easily integrated with reading activity data. Our first experiment compared LECTOR with representative Natural Language Processing (NLP) models in extracting key information from 2,255 lecture slides, showing an average improvement of 5% in F1-score. These results were further validated through a human evaluation involving 28 students, which showed an average improvement of 21% in F1-score over a model predominantly used in current educational tools. Our second experiment compared reading preferences extracted by LECTOR with traditional reading activity data in predicting low-performing students, using 600,712 logs from 218 students. The results suggest that integrating LECTOR tends to improve predictive performance. Finally, we present examples showing how the reading preferences extracted by LECTOR could be applied to design personalized interventions for students.
@article{zapata2025_2505.07898,
  title={LECTOR: Summarizing E-book Reading Content for Personalized Student Support},
  author={Erwin Daniel López Zapata and Cheng Tang and Valdemar Švábenský and Fumiya Okubo and Atsushi Shimada},
  journal={arXiv preprint arXiv:2505.07898},
  year={2025}
}