RDI: An adversarial robustness evaluation metric for deep neural networks based on sample clustering features

Deep neural networks (DNNs) are highly susceptible to adversarial samples, raising concerns about their reliability in safety-critical tasks. Current methods for evaluating adversarial robustness fall primarily into attack-based and certified evaluation approaches. The former not only relies on specific attack algorithms but is also highly time-consuming, while the latter, owing to its analytical nature, is typically difficult to apply to large and complex models. A few studies evaluate model robustness from the model's decision boundary, but they suffer from low evaluation accuracy. To address these issues, we propose a novel adversarial robustness evaluation metric, the Robustness Difference Index (RDI), based on sample clustering features. RDI draws inspiration from clustering evaluation: it analyzes the intra-class and inter-class distances of feature vectors separated by the decision boundary to quantify model robustness. It is attack-independent and computationally efficient. Experiments show that RDI has a stronger correlation with the gold-standard adversarial robustness metric, attack success rate (ASR), than existing approaches. The average computation time of RDI is only 1/30 that of the PGD-attack-based evaluation method. Our open-source code is available at: this https URL.
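The abstract does not give the exact RDI formula, but the core idea (comparing intra-class and inter-class distances of a model's feature vectors, in the spirit of clustering quality indices) can be sketched as follows. The function name and the ratio-style normalization below are illustrative assumptions, not the paper's definition:

```python
import numpy as np

def rdi_like_index(features: np.ndarray, labels: np.ndarray) -> float:
    """Illustrative sketch of a clustering-based robustness index.

    features: (n_samples, d) feature vectors extracted from the model
    labels:   (n_samples,) class labels assigned to those samples

    NOTE: this is a hypothetical stand-in for RDI; the paper's exact
    formula is not given in the abstract.
    """
    classes = np.unique(labels)
    centroids = np.stack(
        [features[labels == c].mean(axis=0) for c in classes]
    )

    # Mean intra-class distance: samples to their own class centroid.
    intra = np.mean([
        np.linalg.norm(features[labels == c] - centroids[i], axis=1).mean()
        for i, c in enumerate(classes)
    ])

    # Mean inter-class distance: all pairwise centroid distances.
    diffs = centroids[:, None, :] - centroids[None, :, :]
    pairwise = np.linalg.norm(diffs, axis=-1)
    inter = pairwise[np.triu_indices(len(classes), k=1)].mean()

    # Well-separated, tight classes (inter >> intra) suggest features
    # that are harder to push across the decision boundary.
    return (inter - intra) / (inter + intra)
```

Under this reading, the index lies in (-1, 1] and approaches 1 when classes form tight, well-separated clusters in feature space; a single forward pass over a sample set suffices, which is consistent with the claimed attack independence and low cost.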
@article{song2025_2504.18556,
  title   = {RDI: An adversarial robustness evaluation metric for deep neural networks based on sample clustering features},
  author  = {Jialei Song and Xingquan Zuo and Feiyang Wang and Hai Huang and Tianle Zhang},
  journal = {arXiv preprint arXiv:2504.18556},
  year    = {2025}
}