Video-to-Task Learning via Motion-Guided Attention for Few-Shot Action Recognition

18 November 2024

Papers citing "Video-to-Task Learning via Motion-Guided Attention for Few-Shot Action Recognition"

25 / 25 papers shown

Title
A Comprehensive Review of Few-shot Action Recognition Yuyang Wanyan Xiaoshan Yang Weiming Dong Changsheng Xu VLM 155 3 0 20 Jul 2024
Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey Yi Xin Jianjiang Yang Haodi Zhou Junlong Du Junlong Du Yue Fan Qing Li Qing Li Yuntao Du VLM 146 85 0 03 Feb 2024
D $^2$ ST-Adapter: Disentangled-and-Deformable Spatio-Temporal Adapter for Few-shot Action Recognition Wenjie Pei Qizhong Tan Guangming Lu Jiandong Tian 112 3 0 03 Dec 2023
MoLo: Motion-augmented Long-short Contrastive Learning for Few-shot Action Recognition Xiang Wang Shiwei Zhang Zhiwu Qing Changxin Gao Yingya Zhang Deli Zhao Nong Sang 71 45 0 03 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models Junnan Li Dongxu Li Silvio Savarese Steven C. H. Hoi VLM MLLM 432 4,656 0 30 Jan 2023
Bi-directional Feature Reconstruction Network for Fine-Grained Few-Shot Image Classification Jijie Wu Dongliang Chang Aneeshan Sain Xiaoxu Li Zhanyu Ma Jie Cao Jun Guo Yi-Zhe Song 78 35 0 30 Nov 2022
ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning Junting Pan Ziyi Lin Xiatian Zhu Jing Shao Hongsheng Li 90 206 0 27 Jun 2022
Vision Transformer Adapter for Dense Predictions Zhe Chen Yuchen Duan Wenhai Wang Junjun He Tong Lu Jifeng Dai Yu Qiao 138 567 0 17 May 2022
Motion-driven Visual Tempo Learning for Video-based Action Recognition Yuanzhong Liu Junsong Yuan Zhigang Tu 64 59 0 24 Feb 2022
CLIP-Adapter: Better Vision-Language Models with Feature Adapters Peng Gao Shijie Geng Renrui Zhang Teli Ma Rongyao Fang Yongfeng Zhang Hongsheng Li Yu Qiao VLM CLIP 332 1,050 0 09 Oct 2021
MLP-Mixer: An all-MLP Architecture for Vision Ilya O. Tolstikhin N. Houlsby Alexander Kolesnikov Lucas Beyer Xiaohua Zhai ... Andreas Steiner Daniel Keysers Jakob Uszkoreit Mario Lucic Alexey Dosovitskiy 441 2,694 0 04 May 2021
Learning Transferable Visual Models From Natural Language Supervision Alec Radford Jong Wook Kim Chris Hallacy Aditya A. Ramesh Gabriel Goh ... Amanda Askell Pamela Mishkin Jack Clark Gretchen Krueger Ilya Sutskever CLIP VLM 1.0K 29,926 0 26 Feb 2021
Temporal-Relational CrossTransformers for Few-Shot Action Recognition Toby Perrett A. Masullo T. Burghardt Majid Mirmehdi Dima Damen ViT 87 149 0 15 Jan 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale Alexey Dosovitskiy Lucas Beyer Alexander Kolesnikov Dirk Weissenborn Xiaohua Zhai ... Matthias Minderer G. Heigold Sylvain Gelly Jakob Uszkoreit N. Houlsby ViT 684 41,563 0 22 Oct 2020
A Two-Stage Approach to Few-Shot Learning for Image Recognition Debasmit Das C. S. George Lee 95 126 0 10 Dec 2019
Few-Shot Video Classification via Temporal Alignment Kaidi Cao Jingwei Ji Zhangjie Cao C. Chang Juan Carlos Niebles AI4TS 86 241 0 27 Jun 2019
TSM: Temporal Shift Module for Efficient Video Understanding Ji Lin Chuang Gan Song Han 98 1,694 0 20 Nov 2018
The "something something" video database for learning and evaluating visual common sense Raghav Goyal Samira Ebrahimi Kahou Vincent Michalski Joanna Materzynska S. Westphal ... Moritz Mueller-Freitag F. Hoppe Christian Thurau Ingo Bax Roland Memisevic VLM 103 1,542 0 13 Jun 2017
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset João Carreira Andrew Zisserman 240 8,045 0 22 May 2017
Prototypical Networks for Few-shot Learning Jake C. Snell Kevin Swersky R. Zemel 305 8,164 0 15 Mar 2017
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks Chelsea Finn Pieter Abbeel Sergey Levine OOD 833 11,961 0 09 Mar 2017
Temporal Segment Networks: Towards Good Practices for Deep Action Recognition Limin Wang Yuanjun Xiong Zhe Wang Yu Qiao Dahua Lin Xiaoou Tang Luc Van Gool ViT 120 3,841 0 02 Aug 2016
Matching Networks for One Shot Learning Oriol Vinyals Charles Blundell Timothy Lillicrap Koray Kavukcuoglu Daan Wierstra VLM 378 7,343 0 13 Jun 2016
Two-Stream Convolutional Networks for Action Recognition in Videos Karen Simonyan Andrew Zisserman 261 7,545 0 09 Jun 2014
UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild K. Soomro Amir Zamir M. Shah CLIP VGen 165 6,170 0 03 Dec 2012