Diversity Covariance-Aware Prompt Learning for Vision-Language Models

3 March 2025
Songlin Dong
Zhengdong Zhou
Chenhao Ding
Xinyuan Gao
Alex C. Kot
Yihong Gong
Abstract

Prompt tuning can further enhance the performance of vision-language models across various downstream tasks (e.g., few-shot learning), enabling them to better adapt to specific applications and needs. In this paper, we present a Diversity Covariance-Aware framework that learns distributional information from the data to enhance the few-shot ability of the prompt model. First, we propose a covariance-aware method that models the covariance relationships between visual features and uses the anisotropic Mahalanobis distance, instead of the suboptimal cosine distance, to measure the similarity between the two modalities. We rigorously derive and prove the validity of this modeling process. Then, we propose a diversity-aware method, which learns multiple diverse soft prompts to capture different attributes of categories and aligns them independently with the visual modality. This method achieves multi-centered covariance modeling, leading to more diverse decision boundaries. Extensive experiments on 11 datasets across various tasks demonstrate the effectiveness of our method.
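
To make the two ideas in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single covariance matrix shared across classes and estimated from few-shot visual features, and it assumes each class is scored by its nearest prompt center; the abstract specifies neither choice. All function names (`shared_covariance`, `mahalanobis_sq`, `multi_center_scores`) and the `shrinkage` parameter are hypothetical.

```python
import numpy as np


def shared_covariance(features, labels, shrinkage=1e-3):
    """Pooled within-class covariance of visual features (assumed shared by all classes).

    features: (n, d) array, labels: (n,) array of class ids.
    A small ridge term keeps the matrix invertible in the few-shot regime.
    """
    classes = np.unique(labels)
    dim = features.shape[1]
    # Subtract each class mean so only within-class scatter remains.
    centered = np.concatenate(
        [features[labels == c] - features[labels == c].mean(axis=0) for c in classes]
    )
    dof = max(len(features) - len(classes), 1)
    return centered.T @ centered / dof + shrinkage * np.eye(dim)


def mahalanobis_sq(x, centers, cov_inv):
    """Squared Mahalanobis distance from every row of x (n, d) to every center (k, d)."""
    diff = x[:, None, :] - centers[None, :, :]  # (n, k, d)
    return np.einsum("nkd,de,nke->nk", diff, cov_inv, diff)


def multi_center_scores(x, prompt_centers, cov_inv):
    """Score each class by its closest prompt center.

    prompt_centers: (num_classes, num_prompts, d), one text feature per soft prompt.
    Taking the minimum over prompts is an assumption made for this sketch.
    """
    num_classes, num_prompts, dim = prompt_centers.shape
    dist = mahalanobis_sq(x, prompt_centers.reshape(-1, dim), cov_inv)
    return -dist.reshape(len(x), num_classes, num_prompts).min(axis=2)


# Two centers at the same Euclidean distance from the query, but the first
# lies along the high-variance axis, so the Mahalanobis distance prefers it.
cov_inv = np.linalg.inv(np.diag([4.0, 1.0]))
query = np.array([[0.0, 0.0]])
centers = np.array([[2.0, 0.0], [0.0, 2.0]])
distances = mahalanobis_sq(query, centers, cov_inv)
```

In this toy case a cosine or Euclidean comparison cannot separate the two centers, while the covariance-aware distance does; using several prompt centers per class lets each class cover more than one region of the feature space.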

@article{dong2025_2503.01531,
  title={Diversity Covariance-Aware Prompt Learning for Vision-Language Models},
  author={Songlin Dong and Zhengdong Zhou and Chenhao Ding and Xinyuan Gao and Alex C. Kot and Yihong Gong},
  journal={arXiv preprint arXiv:2503.01531},
  year={2025}
}