
Pre-trained Vision-Language Models Assisted Noisy Partial Label Learning

Main: 11 pages, 4 figures; Bibliography: 2 pages
Abstract

In noisy partial label learning (NPLL), each training sample is associated with a set of candidate labels annotated by multiple noisy annotators. With the emergence of high-performance pre-trained vision-language models (VLMs) such as CLIP, LLaVA and GPT-4V, using these models to replace time-consuming manual annotation workflows and achieve "manual-annotation-free" training for downstream tasks has become a highly promising research direction. This paper focuses on learning from noisy partial labels annotated by pre-trained VLMs and proposes an innovative collaborative consistency regularization (Co-Reg) method. Unlike the symmetric noise primarily addressed in traditional noisy label learning, the noise generated by pre-trained models is instance-dependent, embodying the underlying patterns of the pre-trained models themselves, which significantly increases the learning difficulty. To address this, we simultaneously train two neural networks that collaboratively purify the training labels through a "Co-Pseudo-Labeling" mechanism, while enforcing consistency regularization constraints in both the label space and the feature representation space. Our method can also leverage few-shot manually annotated valid labels to further enhance its performance. Comparative experiments with different denoising and disambiguation algorithms, annotation manners, and pre-trained model application schemes fully validate the effectiveness of the proposed method, while revealing the broad prospects of integrating weakly-supervised learning techniques into the knowledge distillation process of pre-trained models.

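To make the training idea concrete, the sketch below gives one plausible reading of the Co-Reg objective as described in the abstract: two networks exchange pseudo-labels restricted to the VLM-provided candidate set ("Co-Pseudo-Labeling"), and are additionally regularized for consistency in the label space (peer agreement) and in the feature space (agreement between two augmented views). The function names, the weak/strong augmentation scheme, and the use of MSE for the consistency terms are illustrative assumptions, not the authors' exact formulation.

import torch
import torch.nn.functional as F

def co_pseudo_label(logits, candidate_mask, temperature=1.0):
    # Restrict a network's softmax scores to the candidate label set
    # (a 0/1 mask per sample) and renormalize, yielding soft pseudo-labels
    # that will supervise the peer network.
    probs = F.softmax(logits / temperature, dim=1) * candidate_mask
    return probs / probs.sum(dim=1, keepdim=True).clamp_min(1e-12)

def co_reg_step(net_a, net_b, x_weak, x_strong, candidate_mask,
                lambda_label=1.0, lambda_feat=0.1):
    # One hypothetical training step. Each net is assumed to return
    # (logits, features) for a batch of images.
    logits_a_w, feat_a_w = net_a(x_weak)
    logits_a_s, feat_a_s = net_a(x_strong)
    logits_b_w, feat_b_w = net_b(x_weak)
    logits_b_s, feat_b_s = net_b(x_strong)

    # Co-Pseudo-Labeling: each network is supervised by its peer's
    # predictions, masked to the candidate set and detached.
    with torch.no_grad():
        targets_for_a = co_pseudo_label(logits_b_w, candidate_mask)
        targets_for_b = co_pseudo_label(logits_a_w, candidate_mask)

    # Classification loss on the strongly augmented view (soft targets).
    loss_cls = (F.cross_entropy(logits_a_s, targets_for_a)
                + F.cross_entropy(logits_b_s, targets_for_b))

    # Label-space consistency: encourage the two networks to agree.
    loss_label = F.mse_loss(F.softmax(logits_a_w, dim=1),
                            F.softmax(logits_b_w, dim=1))

    # Feature-space consistency: align representations of the two views.
    loss_feat = (F.mse_loss(F.normalize(feat_a_w, dim=1), F.normalize(feat_a_s, dim=1))
                 + F.mse_loss(F.normalize(feat_b_w, dim=1), F.normalize(feat_b_s, dim=1)))

    return loss_cls + lambda_label * loss_label + lambda_feat * loss_feat

In this reading, the candidate mask comes from the VLM annotations, so no manual labels are required; few-shot valid labels, when available, could simply be added as an extra supervised cross-entropy term.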
@article{wang2025_2506.03229,
  title={Pre-trained Vision-Language Models Assisted Noisy Partial Label Learning},
  author={Qian-Wei Wang and Yuqiu Xie and Letian Zhang and Zimo Liu and Shu-Tao Xia},
  journal={arXiv preprint arXiv:2506.03229},
  year={2025}
}