ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2211.12764
  4. Cited By
VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval

VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval

23 November 2022
Siteng Huang
Biao Gong
Yulin Pan
Jianwen Jiang
Yiliang Lv
Yuyuan Li
Donglin Wang
    VLM
    VPVLM
ArXivPDFHTML

Papers citing "VoP: Text-Video Co-operative Prompt Tuning for Cross-Modal Retrieval"

23 / 23 papers shown
Title
DMPT: Decoupled Modality-aware Prompt Tuning for Multi-modal Object Re-identification
DMPT: Decoupled Modality-aware Prompt Tuning for Multi-modal Object Re-identification
Minghui Lin
Shu Wang
Xiang Wang
Jianhua Tang
Longbin Fu
Zhengrong Zuo
Nong Sang
VLM
145
0
0
15 Apr 2025
Leveraging Modality Tags for Enhanced Cross-Modal Video Retrieval
Leveraging Modality Tags for Enhanced Cross-Modal Video Retrieval
A. Fragomeni
Dima Damen
Michael Wray
84
0
0
02 Apr 2025
STOP: Integrated Spatial-Temporal Dynamic Prompting for Video Understanding
STOP: Integrated Spatial-Temporal Dynamic Prompting for Video Understanding
Zichen Liu
Kunlun Xu
Fuchun Sun
Xu Zou
Yuxin Peng
Jiahuan Zhou
VLM
AI4TS
135
2
0
20 Mar 2025
TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval
TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval
Leqi Shen
Tianxiang Hao
Tao He
Sicheng Zhao
Pengzhang Liu
Yongjun Bao
Guiguang Ding
Guiguang Ding
199
10
0
02 Sep 2024
Ego-VPA: Egocentric Video Understanding with Parameter-efficient Adaptation
Ego-VPA: Egocentric Video Understanding with Parameter-efficient Adaptation
Tz-Ying Wu
Kyle Min
Subarna Tripathi
Nuno Vasconcelos
EgoV
106
0
0
28 Jul 2024
Learning to Prompt for Open-Vocabulary Object Detection with
  Vision-Language Model
Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model
Yu Du
Fangyun Wei
Zihe Zhang
Miaojing Shi
Yue Gao
Guoqi Li
VPVLM
VLM
61
330
0
28 Mar 2022
Visual Prompt Tuning
Visual Prompt Tuning
Menglin Jia
Luming Tang
Bor-Chun Chen
Claire Cardie
Serge Belongie
Bharath Hariharan
Ser-Nam Lim
VLM
VPVLM
124
1,576
0
23 Mar 2022
Prompting Visual-Language Models for Efficient Video Understanding
Prompting Visual-Language Models for Efficient Video Understanding
Chen Ju
Tengda Han
Kunhao Zheng
Ya Zhang
Weidi Xie
VPVLM
VLM
59
371
0
08 Dec 2021
Object-aware Video-language Pre-training for Retrieval
Object-aware Video-language Pre-training for Retrieval
Alex Jinpeng Wang
Yixiao Ge
Guanyu Cai
Rui Yan
Xudong Lin
Ying Shan
Xiaohu Qie
Mike Zheng Shou
ViT
VLM
55
80
0
01 Dec 2021
Towards a Unified View of Parameter-Efficient Transfer Learning
Towards a Unified View of Parameter-Efficient Transfer Learning
Junxian He
Chunting Zhou
Xuezhe Ma
Taylor Berg-Kirkpatrick
Graham Neubig
AAML
85
918
0
08 Oct 2021
Learning to Prompt for Vision-Language Models
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
457
2,340
0
02 Sep 2021
HANet: Hierarchical Alignment Networks for Video-Text Retrieval
HANet: Hierarchical Alignment Networks for Video-Text Retrieval
Peng Wu
Xiangteng He
Mingqian Tang
Yiliang Lv
Jing Liu
67
53
0
26 Jul 2021
BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based
  Masked Language-models
BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models
Elad Ben-Zaken
Shauli Ravfogel
Yoav Goldberg
147
1,191
0
18 Jun 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
478
3,952
0
18 Apr 2021
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery
Or Patashnik
Zongze Wu
Eli Shechtman
Daniel Cohen-Or
Dani Lischinski
CLIP
VLM
73
1,204
0
31 Mar 2021
GPT Understands, Too
GPT Understands, Too
Xiao Liu
Yanan Zheng
Zhengxiao Du
Ming Ding
Yujie Qian
Zhilin Yang
Jie Tang
VLM
145
1,161
0
18 Mar 2021
Less is More: ClipBERT for Video-and-Language Learning via Sparse
  Sampling
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei
Linjie Li
Luowei Zhou
Zhe Gan
Tamara L. Berg
Joey Tianyi Zhou
Jingjing Liu
CLIP
110
651
0
11 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy
  Text Supervision
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
405
3,778
0
11 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
343
2,016
0
09 Feb 2021
Multi-modal Transformer for Video Retrieval
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
511
602
0
21 Jul 2020
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million
  Narrated Video Clips
HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips
Antoine Miech
Dimitri Zhukov
Jean-Baptiste Alayrac
Makarand Tapaswi
Ivan Laptev
Josef Sivic
VGen
101
1,192
0
07 Jun 2019
Localizing Moments in Video with Natural Language
Localizing Moments in Video with Natural Language
Lisa Anne Hendricks
Oliver Wang
Eli Shechtman
Josef Sivic
Trevor Darrell
Bryan C. Russell
95
940
0
04 Aug 2017
SGDR: Stochastic Gradient Descent with Warm Restarts
SGDR: Stochastic Gradient Descent with Warm Restarts
I. Loshchilov
Frank Hutter
ODL
246
8,030
0
13 Aug 2016
1