PrivacyCD: Hierarchical Unlearning for Protecting Student Privacy in Cognitive Diagnosis

6 November 2025

Mingliang Hou

Yinuo Wang

Teng Guo

Zitao Liu

Wenzhou Dou

Jiaqi Zheng

Renqiang Luo

Mi Tian

Weiqi Luo

Main:6 Pages

20 Figures

Bibliography:2 Pages

5 Tables

Appendix:12 Pages

Abstract

The need to remove specific student data from cognitive diagnosis (CD) models has become a pressing requirement, driven by users' growing assertion of their "right to be forgotten". However, existing CD models are largely designed without privacy considerations and lack effective data unlearning mechanisms. Directly applying general-purpose unlearning algorithms is suboptimal, as they struggle to balance unlearning completeness, model utility, and efficiency when confronted with the unique heterogeneous structure of CD models. To address this, our paper presents the first systematic study of the data unlearning problem for CD models and proposes a novel and efficient algorithm: hierarchical importance-guided forgetting (HIF). Our key insight is that parameter importance in CD models exhibits distinct layer-wise characteristics. HIF leverages this via an innovative smoothing mechanism that combines individual and layer-level importance, enabling a more precise distinction of parameters associated with the data to be unlearned. Experiments on three real-world datasets show that HIF significantly outperforms baselines on key metrics, offering the first effective solution for CD models to respond to user data removal requests and for deploying high-performance, privacy-preserving AI systems.
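The abstract does not state how HIF combines individual and layer-level importance, so the following is only an illustrative sketch of one way such smoothing could work, not the authors' method. It assumes that each parameter already has a non-negative importance score computed separately on the data to forget and on the data to retain. The convex blend with the layer mean, the mixing weight `alpha`, the threshold `ratio`, and the function names are all hypothetical.

```python
from statistics import mean


def smooth_importance(per_layer, alpha=0.5):
    """Blend each parameter's score with the mean score of its layer.

    per_layer: dict mapping a layer name to a list of importance scores.
    alpha: hypothetical mixing weight (1.0 keeps only the individual score,
           0.0 keeps only the layer mean). Not taken from the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    smoothed = {}
    for layer, scores in per_layer.items():
        layer_mean = mean(scores)
        smoothed[layer] = [alpha * s + (1 - alpha) * layer_mean for s in scores]
    return smoothed


def select_parameters(forget_imp, retain_imp, alpha=0.5, ratio=1.0):
    """Return (layer, index) pairs that matter more to the forget data.

    A parameter is flagged when its smoothed forget-set importance exceeds
    ratio times its smoothed retain-set importance (an assumed rule).
    """
    f = smooth_importance(forget_imp, alpha)
    r = smooth_importance(retain_imp, alpha)
    return [
        (layer, i)
        for layer in f
        for i, (fi, ri) in enumerate(zip(f[layer], r[layer]))
        if fi > ratio * ri
    ]


# Toy scores for two hypothetical layers of a CD model.
forget = {"student_emb": [0.9, 0.1], "interaction": [0.2, 0.2]}
retain = {"student_emb": [0.2, 0.2], "interaction": [0.4, 0.4]}
selected = select_parameters(forget, retain)
```

In this toy example the second `student_emb` parameter would not be flagged on its individual score alone (0.1 against 0.2), but the layer-level term pulls it in because the layer as a whole is strongly tied to the forget data. This is the kind of effect a layer-aware smoothing is meant to capture.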
