Statistical Inference for Clustering-based Anomaly Detection

25 April 2025
Nguyen Thi Minh Phu
Duong Tan Loc
Vo Nguyen Le Duy
Abstract

Unsupervised anomaly detection (AD) is a fundamental problem in machine learning and statistics. A popular approach to unsupervised AD is clustering-based detection. However, this method lacks the ability to guarantee the reliability of the detected anomalies. In this paper, we propose SI-CLAD (Statistical Inference for CLustering-based Anomaly Detection), a novel statistical framework for testing the clustering-based AD results. The key strength of SI-CLAD lies in its ability to rigorously control the probability of falsely identifying anomalies, maintaining it below a pre-specified significance level α (e.g., α = 0.05). By analyzing the selection mechanism inherent in clustering-based AD and leveraging the Selective Inference (SI) framework, we prove that false detection control is attainable. Moreover, we introduce a strategy to boost the true detection rate, enhancing the overall performance of SI-CLAD. Extensive experiments on synthetic and real-world datasets provide strong empirical support for our theoretical findings, showcasing the superior performance of the proposed method.
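
The problem the abstract describes, that anomalies flagged by clustering come with no reliability guarantee, can be illustrated with a small simulation. The sketch below is not SI-CLAD and does not implement selective inference. It assumes a toy 1-D setting with unit-variance Gaussian noise, uses a hand-written 2-means as a stand-in for whatever clustering method the paper uses, and treats the smaller cluster as the "anomalies"; all function names are hypothetical. It then applies a naive z-test that ignores the fact that the groups were chosen by looking at the data, and estimates how often that test rejects when there are no real anomalies.

```python
import math
import random


def two_means_1d(xs, iters=20):
    """Toy 1-D k-means with k=2 (a stand-in clustering step); returns 0/1 labels."""
    c0, c1 = min(xs), max(xs)  # initialise the two centres at the extremes
    labels = [0] * len(xs)
    for _ in range(iters):
        labels = [0 if abs(x - c0) <= abs(x - c1) else 1 for x in xs]
        g0 = [x for x, lab in zip(xs, labels) if lab == 0]
        g1 = [x for x, lab in zip(xs, labels) if lab == 1]
        if not g0 or not g1:
            break
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return labels


def naive_p_value(xs, labels):
    """Two-sided z-test on the difference of group means.

    Assumes unit variance and (wrongly) treats the two groups as if they
    had been fixed before seeing the data.
    """
    small = 0 if labels.count(0) <= labels.count(1) else 1
    flagged = [x for x, lab in zip(xs, labels) if lab == small]  # "anomalies"
    rest = [x for x, lab in zip(xs, labels) if lab != small]
    if not flagged or not rest:
        return 1.0
    diff = sum(flagged) / len(flagged) - sum(rest) / len(rest)
    z = diff / math.sqrt(1 / len(flagged) + 1 / len(rest))
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability


def naive_false_detection_rate(trials=500, n=30, alpha=0.05, seed=0):
    """Fraction of null datasets (no true anomalies) where the naive test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]  # pure noise
        if naive_p_value(xs, two_means_1d(xs)) < alpha:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    rate = naive_false_detection_rate()
    print(f"naive false detection rate: {rate:.2f} (nominal level 0.05)")
```

In this toy setting the naive rejection rate should come out far above the nominal α, because the clustering step selects the very split that makes the two groups look different. A selection-aware test, as proposed in the paper, is meant to account for that step so that the false detection probability stays below α.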

@article{phu2025_2504.18633,
  title={Statistical Inference for Clustering-based Anomaly Detection},
  author={Nguyen Thi Minh Phu and Duong Tan Loc and Vo Nguyen Le Duy},
  journal={arXiv preprint arXiv:2504.18633},
  year={2025}
}