RDI: An adversarial robustness evaluation metric for deep neural networks based on sample clustering features

Deep neural networks (DNNs) are highly susceptible to adversarial samples, raising concerns about their reliability in safety-critical tasks. Current methods for evaluating adversarial robustness fall primarily into attack-based and certified evaluation approaches. The former not only relies on specific attack algorithms but is also highly time-consuming, while the latter, owing to its analytical nature, is typically difficult to apply to large and complex models. A few studies evaluate model robustness from the model's decision boundary, but they suffer from low evaluation accuracy. To address these issues, we propose a novel adversarial robustness evaluation metric, the Robustness Difference Index (RDI), based on sample clustering features. RDI draws inspiration from clustering evaluation: it analyzes the intra-class and inter-class distances of feature vectors separated by the decision boundary to quantify model robustness. It is attack-independent and computationally efficient. Experiments show that RDI has a stronger correlation with the gold-standard adversarial robustness metric, attack success rate (ASR), than existing approaches. The average computation time of RDI is only 1/30 that of the PGD-attack-based evaluation method. Our open-source code is available at: this https URL.
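The abstract does not give the exact RDI formula, but the core idea (comparing intra-class and inter-class distances of a model's feature vectors, in the spirit of clustering quality indices) can be sketched as follows. The function name and the ratio-style normalization below are illustrative assumptions, not the paper's definition:

```python
import numpy as np

def rdi_like_index(features: np.ndarray, labels: np.ndarray) -> float:
    """Illustrative sketch of a clustering-based robustness index.

    features: (n_samples, d) feature vectors extracted from the model
    labels:   (n_samples,) class labels assigned to those samples

    NOTE: this is a hypothetical stand-in for RDI; the paper's exact
    formula is not given in the abstract.
    """
    classes = np.unique(labels)
    centroids = np.stack(
        [features[labels == c].mean(axis=0) for c in classes]
    )

    # Mean intra-class distance: samples to their own class centroid.
    intra = np.mean([
        np.linalg.norm(features[labels == c] - centroids[i], axis=1).mean()
        for i, c in enumerate(classes)
    ])

    # Mean inter-class distance: all pairwise centroid distances.
    diffs = centroids[:, None, :] - centroids[None, :, :]
    pairwise = np.linalg.norm(diffs, axis=-1)
    inter = pairwise[np.triu_indices(len(classes), k=1)].mean()

    # Well-separated, tight classes (inter >> intra) suggest features
    # that are harder to push across the decision boundary.
    return (inter - intra) / (inter + intra)
```

Under this reading, the index lies in (-1, 1] and approaches 1 when classes form tight, well-separated clusters in feature space; a single forward pass over a sample set suffices, which is consistent with the claimed attack independence and low cost.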
@article{song2025_2504.18556,
  title   = {RDI: An adversarial robustness evaluation metric for deep neural networks based on sample clustering features},
  author  = {Jialei Song and Xingquan Zuo and Feiyang Wang and Hai Huang and Tianle Zhang},
  journal = {arXiv preprint arXiv:2504.18556},
  year    = {2025}
}