2109.11797
Cited By
CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models
24 September 2021
Yuan Yao
Ao Zhang
Zhengyan Zhang
Zhiyuan Liu
Tat-Seng Chua
Maosong Sun
MLLM
VPVLM
VLM
Papers citing "CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models"
50 / 160 papers shown
Title
CLIP-Powered Domain Generalization and Domain Adaptation: A Comprehensive Survey
Jindong Li
Y. Li
Yali Fu
Jiahong Liu
Yixin Liu
Menglin Yang
Irwin King
VLM
38
0
0
19 Apr 2025
Visual Position Prompt for MLLM based Visual Grounding
Wei Tang
Yanpeng Sun
Qinying Gu
Zechao Li
VLM
50
0
0
19 Mar 2025
3DAxisPrompt: Promoting the 3D Grounding and Reasoning in GPT-4o
Dingning Liu
Cheng Wang
Peng Gao
Renrui Zhang
Xinzhu Ma
Yuan Meng
Zhihui Wang
LRM
44
0
0
17 Mar 2025
PARIC: Probabilistic Attention Regularization for Language Guided Image Classification from Pre-trained Vison Language Models
Mayank Nautiyal
Stela Arranz Gheorghe
Kristiana Stefa
Li Ju
Ida-Maria Sintorn
Prashant Singh
VLM
54
0
0
14 Mar 2025
Embodied Crowd Counting
Runling Long
Yunlong Wang
Jia Wan
Xiang Deng
Xinting Zhu
Weili Guan
Antoni B. Chan
Liqiang Nie
63
0
0
11 Mar 2025
Vision-aware Multimodal Prompt Tuning for Uploadable Multi-source Few-shot Domain Adaptation
Kuanghong Liu
Jin Wang
Kangjian He
Dan Xu
Xuejie Zhang
49
0
0
08 Mar 2025
Introducing Visual Perception Token into Multimodal Large Language Model
Runpeng Yu
Xinyin Ma
Xinchao Wang
MLLM
LRM
75
0
0
24 Feb 2025
Parameter-Efficient Fine-Tuning for Foundation Models
Dan Zhang
Tao Feng
Lilong Xue
Yuandong Wang
Yuxiao Dong
J. Tang
46
7
0
23 Jan 2025
Exploring the Use of Contrastive Language-Image Pre-Training for Human Posture Classification: Insights from Yoga Pose Analysis
Andrzej D. Dobrzycki
Ana M. Bernardos
Luca Bergesio
Andrzej Pomirski
Daniel Sáez-Trigueros
3DH
38
3
0
13 Jan 2025
Towards Visual Grounding: A Survey
Linhui Xiao
Xiaoshan Yang
X. Lan
Yaowei Wang
Changsheng Xu
ObjD
55
3
0
31 Dec 2024
Tuning Vision-Language Models with Candidate Labels by Prompt Alignment
Zhifang Zhang
Yuwei Niu
Xin Liu
Beibei Li
VPVLM
VLM
62
0
0
31 Dec 2024
Visual Fourier Prompt Tuning
Runjia Zeng
Cheng Han
Qifan Wang
Chunshu Wu
Tong Geng
Lifu Huang
Ying Nian Wu
Dongfang Liu
VPVLM
VLM
50
6
0
02 Nov 2024
Language-guided Hierarchical Fine-grained Image Forgery Detection and Localization
Xiao Guo
Xiaohong Liu
I. Masi
Xiaoming Liu
95
9
0
31 Oct 2024
Deep Correlated Prompting for Visual Recognition with Missing Modalities
Lianyu Hu
Tongkai Shi
Wei Feng
Fanhua Shang
Liang Wan
VLM
29
1
0
09 Oct 2024
TuneVLSeg: Prompt Tuning Benchmark for Vision-Language Segmentation Models
Rabin Adhikari
Safal Thapaliya
Manish Dhakal
Bishesh Khanal
MLLM
VLM
30
0
0
07 Oct 2024
Recent Advances of Multimodal Continual Learning: A Comprehensive Survey
Dianzhi Yu
Xinni Zhang
Yankai Chen
Aiwei Liu
Yifei Zhang
Philip S. Yu
Irwin King
VLM
CLL
44
9
0
07 Oct 2024
SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion
Ming Dai
Lingfeng Yang
Yihao Xu
Zhenhua Feng
Wankou Yang
ObjD
27
9
0
26 Sep 2024
Attention Prompting on Image for Large Vision-Language Models
Runpeng Yu
Weihao Yu
Xinchao Wang
VLM
37
6
0
25 Sep 2024
EAGLE: Towards Efficient Arbitrary Referring Visual Prompts Comprehension for Multimodal Large Language Models
Jiacheng Zhang
Yang Jiao
Shaoxiang Chen
Jingjing Chen
Yu-Gang Jiang
28
1
0
25 Sep 2024
Rethinking Prompting Strategies for Multi-Label Recognition with Partial Annotations
Samyak Rawlekar
Shubhang Bhatnagar
Narendra Ahuja
VLM
31
1
0
12 Sep 2024
ArtVLM: Attribute Recognition Through Vision-Based Prefix Language Modeling
William Y. Zhu
Keren Ye
Junjie Ke
Jiahui Yu
Leonidas J. Guibas
P. Milanfar
Feng Yang
45
2
0
07 Aug 2024
MaskInversion: Localized Embeddings via Optimization of Explainability Maps
Walid Bousselham
Sofian Chaybouti
Christian Rupprecht
Vittorio Ferrari
Hilde Kuehne
67
0
0
29 Jul 2024
Navi2Gaze: Leveraging Foundation Models for Navigation and Target Gazing
Jun Zhu
Zihao Du
Haotian Xu
Fengbo Lan
Zilong Zheng
Bo Ma
Shengjie Wang
Tao Zhang
36
4
0
12 Jul 2024
FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance
Jiedong Zhuang
Jiaqi Hu
Lianrui Mu
Rui Hu
Xiaoyu Liang
Jiangnan Ye
Haoji Hu
CLIP
VLM
34
2
0
08 Jul 2024
SafaRi: Adaptive Sequence Transformer for Weakly Supervised Referring Expression Segmentation
Sayan Nag
Koustava Goswami
Srikrishna Karanam
42
2
0
02 Jul 2024
Towards Open-World Grasping with Large Vision-Language Models
Georgios Tziafas
H. Kasaei
LM&Ro
LRM
29
12
0
26 Jun 2024
Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models
Jinhao Li
Haopeng Li
S. Erfani
Lei Feng
James Bailey
Feng Liu
VLM
29
3
0
05 Jun 2024
ProGEO: Generating Prompts through Image-Text Contrastive Learning for Visual Geo-localization
Chen Mao
Jingqi Hu
26
4
0
04 Jun 2024
Improving Multi-label Recognition using Class Co-Occurrence Probabilities
Samyak Rawlekar
Shubhang Bhatnagar
Vishnuvardhan Pogunulu Srinivasulu
Narendra Ahuja
VLM
34
5
0
24 Apr 2024
Monocular 3D lane detection for Autonomous Driving: Recent Achievements, Challenges, and Outlooks
Fulong Ma
Weiqing Qi
Guoyang Zhao
Linwei Zheng
Sheng Wang
Yuxuan Liu
Ming-Yu Liu
74
9
0
10 Apr 2024
Test-Time Adaptation with SaLIP: A Cascade of SAM and CLIP for Zero shot Medical Image Segmentation
Sidra Aleem
Fangyijie Wang
Mayug Maniparambil
Eric Arazo
J. Dietlmeier
Guénolé Silvestre
Kathleen M. Curran
Noel E. O'Connor
Suzanne Little
VLM
MedIm
27
11
0
09 Apr 2024
Cross-Modal Conditioned Reconstruction for Language-guided Medical Image Segmentation
Xiaoshuang Huang
Hongxiang Li
Meng Cao
Long Chen
Chenyu You
Dong An
VLM
41
5
0
03 Apr 2024
Training-Free Semantic Segmentation via LLM-Supervision
Wenfang Sun
Yingjun Du
Gaowen Liu
Ramana Rao Kompella
Cees G. M. Snoek
VLM
44
2
0
31 Mar 2024
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want
Weifeng Lin
Xinyu Wei
Ruichuan An
Peng Gao
Bocheng Zou
Yulin Luo
Siyuan Huang
Shanghang Zhang
Hongsheng Li
VLM
63
33
0
29 Mar 2024
CoDA: Instructive Chain-of-Domain Adaptation with Severity-Aware Visual Prompt Tuning
Ziyang Gong
Fuhao Li
Yupeng Deng
Deblina Bhattacharjee
Xianzheng Ma
Xiangwei Zhu
Zhenming Ji
70
9
0
26 Mar 2024
Data-Efficient 3D Visual Grounding via Order-Aware Referring
Tung-Yu Wu
Sheng-Yu Huang
Yu-Chiang Frank Wang
34
0
0
25 Mar 2024
DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM
YiXuan Wu
Yizhou Wang
Shixiang Tang
Wenhao Wu
Tong He
Wanli Ouyang
Jian Wu
Philip H. S. Torr
ObjD
VLM
32
18
0
19 Mar 2024
RESTORE: Towards Feature Shift for Vision-Language Prompt Learning
Yuncheng Yang
Chuyan Zhang
Zuopeng Yang
Yuting Gao
Yulei Qin
Ke Li
Xing Sun
Jie-jin Yang
Yun Gu
VLM
VPVLM
47
0
0
10 Mar 2024
Test-time Distribution Learning Adapter for Cross-modal Visual Reasoning
Yi Zhang
Ce Zhang
VLM
28
1
0
10 Mar 2024
Domain-Agnostic Mutual Prompting for Unsupervised Domain Adaptation
Zhekai Du
Xinyao Li
Fengling Li
Ke Lu
Lei Zhu
Jingjing Li
40
15
0
05 Mar 2024
Enhancing Vision-Language Pre-training with Rich Supervisions
Yuan Gao
Kunyu Shi
Pengkai Zhu
Edouard Belval
Oren Nuriel
Srikar Appalaraju
Shabnam Ghadar
Vijay Mahadevan
Zhuowen Tu
Stefano Soatto
VLM
CLIP
64
12
0
05 Mar 2024
Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training
David Wan
Jaemin Cho
Elias Stengel-Eskin
Mohit Bansal
VLM
ObjD
51
29
0
04 Mar 2024
Data-free Multi-label Image Recognition via LLM-powered Prompt Tuning
Shuo Yang
Zirui Shang
Yongqi Wang
Derong Deng
Hongwei Chen
Qiyuan Cheng
Xinxiao Wu
VLM
31
6
0
02 Mar 2024
Repositioning the Subject within Image
Yikai Wang
Chenjie Cao
Ke Fan
Qiaole Dong
Yifan Li
Xiangyang Xue
Yanwei Fu
DiffM
36
1
0
30 Jan 2024
Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers
Aleksandar Stanić
Sergi Caelles
Michael Tschannen
LRM
VLM
25
9
0
03 Jan 2024
Test-Time Personalization with Meta Prompt for Gaze Estimation
Huan Liu
Julia Qi
Zhenhao Li
Mohammad Hassanpour
Yang Wang
Konstantinos Plataniotis
Yuanhao Yu
32
4
0
03 Jan 2024
Temporal Adaptive RGBT Tracking with Modality Prompt
Hongyu Wang
Xiaotao Liu
Yifan Li
Meng Sun
Dian Yuan
Jing Liu
31
28
0
02 Jan 2024
GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection
Haozhan Shen
Tiancheng Zhao
Mingwei Zhu
Jianwei Yin
VLM
ObjD
86
11
0
22 Dec 2023
Parrot Captions Teach CLIP to Spot Text
Yiqi Lin
Conghui He
Alex Jinpeng Wang
Bin Wang
Weijia Li
Mike Zheng Shou
36
7
0
21 Dec 2023
Pedestrian Attribute Recognition via CLIP based Prompt Vision-Language Fusion
Xiao Wang
Jiandong Jin
Chenglong Li
Jin Tang
Cheng Zhang
Wei Wang
VLM
15
13
0
17 Dec 2023