ViP²-CLIP: Visual-Perception Prompting with Unified Alignment for Zero-Shot Anomaly Detection

23 May 2025
Ziteng Yang
Jingzehua Xu
Yanshu Li
Zepeng Li
Yeqiang Wang
Xinghui Li
Abstract

Zero-shot anomaly detection (ZSAD) aims to detect anomalies without any target-domain training samples, relying solely on external auxiliary data. Existing CLIP-based methods attempt to activate the model's ZSAD potential via handcrafted or static learnable prompts. The former incur high engineering costs and limited semantic coverage, whereas the latter apply identical descriptions across diverse anomaly types and thus fail to adapt to complex variations. Furthermore, since CLIP is originally pretrained on large-scale classification tasks, its anomaly segmentation quality is highly sensitive to the exact wording of class names, severely constraining prompting strategies that depend on class labels. To address these challenges, we introduce ViP²-CLIP. The key insight of ViP²-CLIP is a Visual-Perception Prompting (ViP-Prompt) mechanism, which fuses global and multi-scale local visual context to adaptively generate fine-grained textual prompts, eliminating manual templates and class-name priors. This design enables our model to focus on precise abnormal regions, making it particularly valuable when category labels are ambiguous or privacy-constrained. Extensive experiments on 15 industrial and medical benchmarks demonstrate that ViP²-CLIP achieves state-of-the-art performance and robust cross-domain generalization.
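The abstract does not include implementation details, but the described idea of conditioning textual prompts on fused global and multi-scale local visual context, rather than on class names, can be illustrated with a minimal sketch. The PyTorch code below is an assumption-laden illustration, not the authors' implementation: the module and function names (ViPPromptSketch, anomaly_map), the feature dimensions, the number of scales, and the two-state "normal"/"anomalous" prompt design are all hypothetical choices made for readability.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sketch of visual-perception prompting (NOT the paper's code).
# Dimensions below (vis_dim, txt_dim, n_ctx, n_scales) are assumptions.

class ViPPromptSketch(nn.Module):
    def __init__(self, vis_dim=1024, txt_dim=768, n_ctx=4, n_scales=3):
        super().__init__()
        # Fuse the global image feature with pooled multi-scale patch features.
        self.fuse = nn.Linear(vis_dim * (1 + n_scales), txt_dim)
        # Map the fused visual context to n_ctx prompt tokens for each of
        # two states ("normal", "anomalous"); no class-name text is needed.
        self.to_ctx = nn.Linear(txt_dim, 2 * n_ctx * txt_dim)
        self.n_ctx, self.txt_dim = n_ctx, txt_dim

    def forward(self, global_feat, multi_scale_feats):
        # global_feat: (B, vis_dim); multi_scale_feats: list of (B, N_s, vis_dim)
        pooled = [f.mean(dim=1) for f in multi_scale_feats]          # (B, vis_dim) each
        ctx = self.fuse(torch.cat([global_feat, *pooled], dim=-1))   # (B, txt_dim)
        prompts = self.to_ctx(ctx).view(-1, 2, self.n_ctx, self.txt_dim)
        return prompts  # image-conditioned prompt tokens per state

def anomaly_map(patch_feats, text_embeds, temperature=0.07):
    # patch_feats: (B, N, txt_dim) projected patch tokens;
    # text_embeds: (B, 2, txt_dim) pooled "normal"/"anomalous" text embeddings.
    patch = F.normalize(patch_feats, dim=-1)
    text = F.normalize(text_embeds, dim=-1)
    logits = torch.einsum("bnd,bkd->bnk", patch, text) / temperature
    return logits.softmax(dim=-1)[..., 1]  # per-patch probability of "anomalous"

# Toy usage with random tensors standing in for CLIP features.
B, vis_dim, txt_dim = 2, 1024, 768
model = ViPPromptSketch(vis_dim, txt_dim)
g = torch.randn(B, vis_dim)
ms = [torch.randn(B, 196, vis_dim) for _ in range(3)]
print(model(g, ms).shape)                                   # torch.Size([2, 2, 4, 768])
print(anomaly_map(torch.randn(B, 196, txt_dim),
                  torch.randn(B, 2, txt_dim)).shape)         # torch.Size([2, 196])

In this sketch the prompt tokens are generated from the image itself, so the text branch never sees a category label, which mirrors the abstract's claim of eliminating manual templates and class-name priors.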

@article{yang2025_2505.17692,
  title={ ViP$^2$-CLIP: Visual-Perception Prompting with Unified Alignment for Zero-Shot Anomaly Detection },
  author={ Ziteng Yang and Jingzehua Xu and Yanshu Li and Zepeng Li and Yeqiang Wang and Xinghui Li },
  journal={arXiv preprint arXiv:2505.17692},
  year={ 2025 }
}
Main: 8 pages · Appendix: 12 pages · Bibliography: 2 pages · 23 figures · 20 tables