2311.00729
ZEETAD: Adapting Pretrained Vision-Language Model for Zero-Shot End-to-End Temporal Action Detection
1 November 2023
Thinh Phan
Khoa T. Vo
Duy Le
Gianfranco Doretto
Don Adjeroh
Ngan Le
VLM
Papers citing "ZEETAD: Adapting Pretrained Vision-Language Model for Zero-Shot End-to-End Temporal Action Detection"
16 / 16 papers shown
Title
Zero-shot Action Localization via the Confidence of Large Vision-Language Models
Josiah Aklilu
Xiaohan Wang
Serena Yeung-Levy
65
1
0
18 Oct 2024
Towards Completeness: A Generalizable Action Proposal Generator for Zero-Shot Temporal Action Localization
Jia-Run Du
Kun-Yu Lin
Jingke Meng
Wei-Shi Zheng
36
0
0
25 Aug 2024
HAT: History-Augmented Anchor Transformer for Online Temporal Action Localization
Sakib Reza
Yuexi Zhang
Mohsen Moghaddam
Mario Sznaier
38
1
0
12 Aug 2024
Exploring Scalability of Self-Training for Open-Vocabulary Temporal Action Localization
Jeongseok Hyun
Su Ho Han
Hyolim Kang
Joon-Young Lee
Seon Joo Kim
VLM
42
2
0
09 Jul 2024
SolarFormer: Multi-scale Transformer for Solar PV Profiling
Adrian de Luis
Minh-Triet Tran
Taisei Hanyu
Anh Tran
Haitao Liao
Roy McCann
Alan Mantooth
Ying Huang
Ngan Le
33
3
0
30 Oct 2023
Z-GMOT: Zero-shot Generic Multiple Object Tracking
Kim Hoang Tran
Anh Duy Le Dinh
Tien-Phat Nguyen
Thinh Phan
Pha Nguyen
Khoa Luu
Don Adjeroh
Gianfranco Doretto
Ngan Hoang Le
VOT
36
5
0
28 May 2023
Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
Wenhao Wu
Xiaohan Wang
Haipeng Luo
Jingdong Wang
Yi Yang
Wanli Ouyang
106
48
0
31 Dec 2022
MaPLe: Multi-modal Prompt Learning
Muhammad Uzair Khattak
H. Rasheed
Muhammad Maaz
Salman Khan
Fahad Shahbaz Khan
VPVLM
VLM
212
532
0
06 Oct 2022
AOE-Net: Entities Interactions Modeling with Adaptive Attention Mechanism for Temporal Action Proposals Generation
Khoa T. Vo
Sang Truong
Kashu Yamazaki
Bhiksha Raj
Minh-Triet Tran
Ngan Le
86
26
0
05 Oct 2022
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
Shoufa Chen
Chongjian Ge
Zhan Tong
Jiangliu Wang
Yibing Song
Jue Wang
Ping Luo
152
639
0
26 May 2022
Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling
Renrui Zhang
Rongyao Fang
Wei Zhang
Peng Gao
Kunchang Li
Jifeng Dai
Yu Qiao
Hongsheng Li
VLM
194
387
0
06 Nov 2021
AEI: Actors-Environment Interaction with Adaptive Attention for Temporal Action Proposals Generation
Khoa T. Vo
Kevin Hyekang Joo
Kashu Yamazaki
Sang Truong
Kris Kitani
Minh-Triet Tran
Ngan Le
EgoV
56
17
0
21 Oct 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
348
2,279
0
02 Sep 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
280
3,858
0
18 Apr 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
337
3,720
0
11 Feb 2021
BSN: Boundary Sensitive Network for Temporal Action Proposal Generation
Tianwei Lin
Xu Zhao
Haisheng Su
Chongjing Wang
Ming Yang
139
700
0
08 Jun 2018