2003.01455
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications
3 March 2020
Authors: Biagio Brattoli, Joseph Tighe, Fedor Zhdanov, Pietro Perona, Krzysztof Chalupka
Tags: VLM
Papers citing "Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications"
21 / 21 papers shown

Is Temporal Prompting All We Need For Limited Labeled Action Recognition?
Authors: Shreyank N. Gowda, Boyan Gao, Xiao Gu, Xiaobo Jin
Tags: VLM
41, 0, 0
02 Apr 2025

Zero-Shot Skeleton-based Action Recognition with Dual Visual-Text Alignment
Authors: Jidong Kuang, Hongsong Wang, Chaolei Han, Jie Gui
37, 1, 0
22 Sep 2024

AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
Authors: Yuhan Zhu, Yuyang Ji, Zhiyu Zhao, Gangshan Wu, Limin Wang
Tags: VLM
39, 7, 0
05 Jul 2024

Potential Field Based Deep Metric Learning
Authors: Shubhang Bhatnagar, Narendra Ahuja
42, 1, 0
28 May 2024

Language-based Action Concept Spaces Improve Video Self-Supervised Learning
Authors: Kanchana Ranasinghe, Michael S. Ryoo
Tags: SSL, VLM
34, 12, 0
20 Jul 2023

A metric learning approach for endoscopic kidney stone identification
Authors: Jorge Gonzalez-Zapata, F. Lopez-Tiro, Elias Villalvazo-Avila, Daniel Flores-Araiza, Jacques Hubert, Andres Mendez-Vazquez, G. Ochoa-Ruiz, C. Daul
28, 4, 0
13 Jul 2023

Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
Authors: Syed Talal Wasim, Muzammal Naseer, Salman Khan, F. Khan, M. Shah
Tags: VLM, VPVLM
30, 73, 0
06 Apr 2023

VicTR: Video-conditioned Text Representations for Activity Recognition
Authors: Kumara Kahatapitiya, Anurag Arnab, Arsha Nagrani, Michael S. Ryoo
31, 19, 0
05 Apr 2023

Fine-tuned CLIP Models are Efficient Video Learners
Authors: H. Rasheed, Muhammad Uzair Khattak, Muhammad Maaz, Salman Khan, F. Khan
Tags: CLIP, VLM
21, 148, 0
06 Dec 2022

Expanding Language-Image Pretrained Models for General Video Recognition
Authors: Bolin Ni, Houwen Peng, Minghao Chen, Songyang Zhang, Gaofeng Meng, Jianlong Fu, Shiming Xiang, Haibin Ling
Tags: VLM, CLIP, ViT
25, 312, 0
04 Aug 2022

Temporal and cross-modal attention for audio-visual zero-shot learning
Authors: Otniel-Bogdan Mercea, Thomas Hummel, A. Sophia Koepke, Zeynep Akata
30, 25, 0
20 Jul 2022

Revisiting Classifier: Transferring Vision-Language Models for Video Recognition
Authors: Wenhao Wu, Zhun Sun, Wanli Ouyang
Tags: VLM
99, 93, 0
04 Jul 2022

Is Fairness Only Metric Deep? Evaluating and Addressing Subgroup Gaps in Deep Metric Learning
Authors: Natalie Dullerud, Karsten Roth, Kimia Hamidieh, Nicolas Papernot, Marzyeh Ghassemi
24, 15, 0
23 Mar 2022

Non-isotropy Regularization for Proxy-based Deep Metric Learning
Authors: Karsten Roth, Oriol Vinyals, Zeynep Akata
19, 36, 0
16 Mar 2022

End-to-End Semantic Video Transformer for Zero-Shot Action Recognition
Authors: Keval Doshi, Yasin Yılmaz
Tags: ViT
21, 2, 0
10 Mar 2022

Tell me what you see: A zero-shot action recognition method based on natural language descriptions
Authors: Valter Estevam, Rayson Laroca, David Menotti, Hélio Pedrini
25, 13, 0
18 Dec 2021

ActionCLIP: A New Paradigm for Video Action Recognition
Authors: Mengmeng Wang, Jiazheng Xing, Yong Liu
Tags: VLM
149, 362, 0
17 Sep 2021

Object Priors for Classifying and Localizing Unseen Actions
Authors: Pascal Mettes, William Thong, Cees G. M. Snoek
19, 20, 0
10 Apr 2021

DFS: A Diverse Feature Synthesis Model for Generalized Zero-Shot Learning
Authors: Bonan Li, Xuecheng Nie, Congying Han
Tags: DiffM
21, 0, 0
19 Mar 2021

A Comprehensive Study of Deep Video Action Recognition
Authors: Yi Zhu, Xinyu Li, Chunhui Liu, Mohammadreza Zolfaghari, Yuanjun Xiong, Chongruo Wu, Zhi-Li Zhang, Joseph Tighe, R. Manmatha, Mu Li
Tags: VLM, AI4TS
35, 184, 0
11 Dec 2020

Learning Attributes Equals Multi-Source Domain Generalization
Authors: Chuang Gan, Tianbao Yang, Boqing Gong
Tags: OOD
152, 197, 0
03 May 2016