2109.01134
Learning to Prompt for Vision-Language Models
2 September 2021
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
Papers citing
"Learning to Prompt for Vision-Language Models"
50 / 391 papers shown
Title
Multimodal Prototype-Enhanced Network for Few-Shot Action Recognition
Xin Ni
Yong Liu
Hao Wen
Yatai Ji
Jing Xiao
Yujiu Yang
37
9
0
09 Dec 2022
PromptonomyViT: Multi-Task Prompt Learning Improves Video Transformers using Synthetic Scene Data
Roei Herzig
Ofir Abramovich
Elad Ben-Avraham
Assaf Arbelle
Leonid Karlinsky
Ariel Shamir
Trevor Darrell
Amir Globerson
41
16
0
08 Dec 2022
Decorate the Newcomers: Visual Domain Prompt for Continual Test Time Adaptation
Yulu Gan
Yan Bai
Yihang Lou
Xianzheng Ma
Renrui Zhang
Nian Shi
Lin Luo
OOD
VLM
30
91
0
08 Dec 2022
ZegCLIP: Towards Adapting CLIP for Zero-shot Semantic Segmentation
Ziqi Zhou
Bowen Zhang
Yinjie Lei
Lingqiao Liu
Yifan Liu
VLM
32
167
0
07 Dec 2022
Fine-tuned CLIP Models are Efficient Video Learners
H. Rasheed
Muhammad Uzair Khattak
Muhammad Maaz
Salman Khan
F. Khan
CLIP
VLM
31
148
0
06 Dec 2022
I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification
Muhammad Ferjad Naeem
Muhammad Gul Zain Ali Khan
Yongqin Xian
Muhammad Zeshan Afzal
D. Stricker
Luc Van Gool
F. Tombari
VLM
35
51
0
05 Dec 2022
Day2Dark: Pseudo-Supervised Activity Recognition beyond Silent Daylight
Yunhua Zhang
Hazel Doughty
Cees G. M. Snoek
VLM
40
0
0
05 Dec 2022
Controllable Image Captioning via Prompting
Ning Wang
Jiahao Xie
Jihao Wu
Mingbo Jia
Linlin Li
19
23
0
04 Dec 2022
Improving Zero-shot Generalization and Robustness of Multi-modal Models
Yunhao Ge
Jie Jessie Ren
Andrew Gallagher
Yuxiao Wang
Ming Yang
Hartwig Adam
Laurent Itti
Balaji Lakshminarayanan
Jiaping Zhao
VLM
29
34
0
04 Dec 2022
Finetune like you pretrain: Improved finetuning of zero-shot vision models
Sachin Goyal
Ananya Kumar
Sankalp Garg
Zico Kolter
Aditi Raghunathan
CLIP
VLM
41
136
0
01 Dec 2022
Exploiting Category Names for Few-Shot Classification with Vision-Language Models
Taihong Xiao
Zirui Wang
Liangliang Cao
Jiahui Yu
Shengyang Dai
Ming Yang
VLM
MLLM
30
5
0
29 Nov 2022
SgVA-CLIP: Semantic-guided Visual Adapting of Vision-Language Models for Few-shot Image Classification
Fang Peng
Xiaoshan Yang
Linhui Xiao
Yaowei Wang
Changsheng Xu
VLM
35
43
0
28 Nov 2022
Multi-Label Continual Learning using Augmented Graph Convolutional Network
Kaile Du
Fan Lyu
Linyan Li
Fuyuan Hu
Wei Feng
Fenglei Xu
Xuefeng Xi
Hanjing Cheng
CLL
25
13
0
27 Nov 2022
CLIP-ReID: Exploiting Vision-Language Model for Image Re-Identification without Concrete Text Labels
Siyuan Li
Li Sun
Qingli Li
VLM
30
149
0
25 Nov 2022
VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval
Siteng Huang
Biao Gong
Yulin Pan
Jianwen Jiang
Yiliang Lv
Yuyuan Li
Donglin Wang
VLM
VPVLM
22
41
0
23 Nov 2022
PointCMC: Cross-Modal Multi-Scale Correspondences Learning for Point Cloud Understanding
Honggu Zhou
Xiaogang Peng
Jiawei Mao
Zizhao Wu
Ming Zeng
3DPC
14
3
0
22 Nov 2022
Knowledge Prompting for Few-shot Action Recognition
Yuheng Shi
Xinxiao Wu
Hanxi Lin
VLM
19
4
0
22 Nov 2022
Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification
Yue Yang
Artemis Panagopoulou
Shenghao Zhou
Daniel Jin
Chris Callison-Burch
Mark Yatskar
40
211
0
21 Nov 2022
Decomposed Soft Prompt Guided Fusion Enhancing for Compositional Zero-Shot Learning
Xiaocheng Lu
Ziming Liu
Song Guo
Jingcai Guo
CoGe
27
30
0
19 Nov 2022
Cross-Modal Adapter for Text-Video Retrieval
Haojun Jiang
Jianke Zhang
Rui Huang
Chunjiang Ge
Zanlin Ni
Jiwen Lu
Jie Zhou
S. Song
Gao Huang
45
36
0
17 Nov 2022
FedTune: A Deep Dive into Efficient Federated Fine-Tuning with Pre-trained Transformers
Jinyu Chen
Wenchao Xu
Song Guo
Junxiao Wang
Jie Zhang
Yining Qi
FedML
28
32
0
15 Nov 2022
Federated Adaptive Prompt Tuning for Multi-Domain Collaborative Learning
Shangchao Su
Min Yang
Bin Li
Xiangyang Xue
VLM
FedML
32
18
0
15 Nov 2022
OneFormer: One Transformer to Rule Universal Image Segmentation
Jitesh Jain
Jiacheng Li
M. Chiu
Ali Hassani
Nikita Orlov
Humphrey Shi
ViT
31
327
0
10 Nov 2022
Fine-grained Visual-Text Prompt-Driven Self-Training for Open-Vocabulary Object Detection
Yanxin Long
Jianhua Han
Runhu Huang
Xu Hang
Yi Zhu
Chunjing Xu
Xiaodan Liang
VLM
ObjD
29
18
0
02 Nov 2022
CPL: Counterfactual Prompt Learning for Vision and Language Models
Xuehai He
Diji Yang
Weixi Feng
Tsu-jui Fu
Arjun Reddy Akula
Varun Jampani
P. Narayana
Sugato Basu
William Yang Wang
Qing Guo
VPVLM
VLM
50
15
0
19 Oct 2022
Scaling & Shifting Your Features: A New Baseline for Efficient Model Tuning
Dongze Lian
Daquan Zhou
Jiashi Feng
Xinchao Wang
36
247
0
17 Oct 2022
Robust Preference Learning for Storytelling via Contrastive Reinforcement Learning
Louis Castricato
Alexander Havrilla
Shahbuland Matiana
Michael Pieler
Anbang Ye
Ian Yang
Spencer Frazier
Mark O. Riedl
31
12
0
14 Oct 2022
Is synthetic data from generative models ready for image recognition?
Ruifei He
Shuyang Sun
Xin Yu
Chuhui Xue
Wenqing Zhang
Philip H. S. Torr
Song Bai
Xiaojuan Qi
37
285
0
14 Oct 2022
Bridging CLIP and StyleGAN through Latent Alignment for Image Editing
Wanfeng Zheng
Qiang Li
Xiaoyan Guo
Pengfei Wan
Zhong-ming Wang
73
14
0
10 Oct 2022
Learning to Decompose Visual Features with Latent Textual Prompts
Feng Wang
Manling Li
Xudong Lin
Hairong Lv
A. Schwing
Heng Ji
VLM
19
23
0
09 Oct 2022
Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
Feng Liang
Bichen Wu
Xiaoliang Dai
Kunpeng Li
Yinan Zhao
Hang Zhang
Peizhao Zhang
Peter Vajda
Diana Marculescu
CLIP
VLM
37
433
0
09 Oct 2022
MaPLe: Multi-modal Prompt Learning
Muhammad Uzair Khattak
H. Rasheed
Muhammad Maaz
Salman Khan
F. Khan
VPVLM
VLM
203
531
0
06 Oct 2022
PLOT: Prompt Learning with Optimal Transport for Vision-Language Models
Guangyi Chen
Weiran Yao
Xiangchen Song
Xinyue Li
Yongming Rao
Anton van den Hengel
VPVLM
VLM
8
62
0
03 Oct 2022
Visual Prompt Tuning for Generative Transfer Learning
Kihyuk Sohn
Yuan Hao
José Lezama
Luisa F. Polanía
Huiwen Chang
Han Zhang
Irfan Essa
Lu Jiang
VPVLM
VLM
56
81
0
03 Oct 2022
Towards Parameter-Efficient Integration of Pre-Trained Language Models In Temporal Video Grounding
Erica K. Shimomoto
Edison Marrese-Taylor
Hiroya Takamura
Ichiro Kobayashi
Hideki Nakayama
Yusuke Miyao
27
7
0
26 Sep 2022
On-Device Domain Generalization
Kaiyang Zhou
Yuanhan Zhang
Yuhang Zang
Jingkang Yang
Chen Change Loy
Ziwei Liu
OOD
30
6
0
15 Sep 2022
Generative Visual Prompt: Unifying Distributional Control of Pre-Trained Generative Models
Chen Henry Wu
Saman Motamed
Shaunak Srivastava
Fernando de la Torre
VLM
DiffM
21
34
0
14 Sep 2022
PromptAttack: Prompt-based Attack for Language Models via Gradient Search
Yundi Shi
Piji Li
Changchun Yin
Zhaoyang Han
Lu Zhou
Zhe Liu
AAML
SILM
24
18
0
05 Sep 2022
Semantic-Enhanced Image Clustering
Shao-Qian Cai
Li-qing Qiu
Xiaojun Chen
Qin Zhang
Long Chen
VLM
29
13
0
21 Aug 2022
Prompt Vision Transformer for Domain Generalization
Zangwei Zheng
Xiangyu Yue
Kai Wang
Yang You
VLM
VPVLM
MDE
30
51
0
18 Aug 2022
Expanding Language-Image Pretrained Models for General Video Recognition
Bolin Ni
Houwen Peng
Minghao Chen
Songyang Zhang
Gaofeng Meng
Jianlong Fu
Shiming Xiang
Haibin Ling
VLM
CLIP
ViT
28
313
0
04 Aug 2022
Visual Recognition by Request
Chufeng Tang
Lingxi Xie
Xiaopeng Zhang
Xiaolin Hu
Qi Tian
VLM
16
15
0
28 Jul 2022
S-Prompts Learning with Pre-trained Transformers: An Occam's Razor for Domain Incremental Learning
Yabin Wang
Zhiwu Huang
Xiaopeng Hong
CLL
VLM
27
210
0
26 Jul 2022
Zero-Shot Temporal Action Detection via Vision-Language Prompting
Sauradip Nag
Xiatian Zhu
Yi-Zhe Song
Tao Xiang
VLM
33
65
0
17 Jul 2022
Contrastive Adapters for Foundation Model Group Robustness
Michael Zhang
Christopher Ré
VLM
18
61
0
14 Jul 2022
Convolutional Bypasses Are Better Vision Transformer Adapters
Shibo Jie
Zhi-Hong Deng
VPVLM
10
131
0
14 Jul 2022
CP3: Unifying Point Cloud Completion by Pretrain-Prompt-Predict Paradigm
Mingye Xu
Yali Wang
Yihao Liu
Tong He
Yu Qiao
3DPC
39
17
0
12 Jul 2022
Open-Vocabulary Multi-Label Classification via Multi-Modal Knowledge Transfer
Su He
Taian Guo
Tao Dai
Ruizhi Qiao
Bo Ren
Shutao Xia
VLM
75
49
0
05 Jul 2022
Revisiting Classifier: Transferring Vision-Language Models for Video Recognition
Wenhao Wu
Zhun Sun
Wanli Ouyang
VLM
103
93
0
04 Jul 2022
Can Language Understand Depth?
Renrui Zhang
Ziyao Zeng
Ziyu Guo
Yafeng Li
VLM
MDE
33
71
0
03 Jul 2022