ResearchTrend.AI
arXiv: 2003.01455
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications

3 March 2020
Biagio Brattoli
Joseph Tighe
Fedor Zhdanov
Pietro Perona
Krzysztof Chalupka
    VLM

Papers citing "Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications"

21 / 21 papers shown
Is Temporal Prompting All We Need For Limited Labeled Action Recognition?
Shreyank N. Gowda
Boyan Gao
Xiao Gu
Xiaobo Jin
VLM
41
0
0
02 Apr 2025
Zero-Shot Skeleton-based Action Recognition with Dual Visual-Text Alignment
Jidong Kuang
Hongsong Wang
Chaolei Han
Jie Gui
37
1
0
22 Sep 2024
AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
Yuhan Zhu
Yuyang Ji
Zhiyu Zhao
Gangshan Wu
Limin Wang
VLM
39
7
0
05 Jul 2024
Potential Field Based Deep Metric Learning
Shubhang Bhatnagar
Narendra Ahuja
42
1
0
28 May 2024
Language-based Action Concept Spaces Improve Video Self-Supervised Learning
Kanchana Ranasinghe
Michael S. Ryoo
SSL
VLM
34
12
0
20 Jul 2023
A metric learning approach for endoscopic kidney stone identification
Jorge Gonzalez-Zapata
F. Lopez-Tiro
Elias Villalvazo-Avila
Daniel Flores-Araiza
Jacques Hubert
Andres Mendez-Vazquez
G. Ochoa-Ruiz
C. Daul
28
4
0
13 Jul 2023
Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting
Syed Talal Wasim
Muzammal Naseer
Salman Khan
F. Khan
M. Shah
VLM
VPVLM
30
73
0
06 Apr 2023
VicTR: Video-conditioned Text Representations for Activity Recognition
Kumara Kahatapitiya
Anurag Arnab
Arsha Nagrani
Michael S. Ryoo
31
19
0
05 Apr 2023
Fine-tuned CLIP Models are Efficient Video Learners
H. Rasheed
Muhammad Uzair Khattak
Muhammad Maaz
Salman Khan
F. Khan
CLIP
VLM
21
148
0
06 Dec 2022
Expanding Language-Image Pretrained Models for General Video Recognition
Bolin Ni
Houwen Peng
Minghao Chen
Songyang Zhang
Gaofeng Meng
Jianlong Fu
Shiming Xiang
Haibin Ling
VLM
CLIP
ViT
25
312
0
04 Aug 2022
Temporal and cross-modal attention for audio-visual zero-shot learning
Otniel-Bogdan Mercea
Thomas Hummel
A. Sophia Koepke
Zeynep Akata
30
25
0
20 Jul 2022
Revisiting Classifier: Transferring Vision-Language Models for Video Recognition
Wenhao Wu
Zhun Sun
Wanli Ouyang
VLM
99
93
0
04 Jul 2022
Is Fairness Only Metric Deep? Evaluating and Addressing Subgroup Gaps in Deep Metric Learning
Natalie Dullerud
Karsten Roth
Kimia Hamidieh
Nicolas Papernot
Marzyeh Ghassemi
24
15
0
23 Mar 2022
Non-isotropy Regularization for Proxy-based Deep Metric Learning
Karsten Roth
Oriol Vinyals
Zeynep Akata
19
36
0
16 Mar 2022
End-to-End Semantic Video Transformer for Zero-Shot Action Recognition
Keval Doshi
Yasin Yılmaz
ViT
21
2
0
10 Mar 2022
Tell me what you see: A zero-shot action recognition method based on natural language descriptions
Valter Estevam
Rayson Laroca
David Menotti
Hélio Pedrini
25
13
0
18 Dec 2021
ActionCLIP: A New Paradigm for Video Action Recognition
Mengmeng Wang
Jiazheng Xing
Yong Liu
VLM
149
362
0
17 Sep 2021
Object Priors for Classifying and Localizing Unseen Actions
Pascal Mettes
William Thong
Cees G. M. Snoek
19
20
0
10 Apr 2021
DFS: A Diverse Feature Synthesis Model for Generalized Zero-Shot Learning
Bonan Li
Xuecheng Nie
Congying Han
DiffM
21
0
0
19 Mar 2021
A Comprehensive Study of Deep Video Action Recognition
Yi Zhu
Xinyu Li
Chunhui Liu
Mohammadreza Zolfaghari
Yuanjun Xiong
Chongruo Wu
Zhi-Li Zhang
Joseph Tighe
R. Manmatha
Mu Li
VLM
AI4TS
35
184
0
11 Dec 2020
Learning Attributes Equals Multi-Source Domain Generalization
Chuang Gan
Tianbao Yang
Boqing Gong
OOD
152
197
0
03 May 2016