A Closer Look at Spatiotemporal Convolutions for Action Recognition

30 November 2017

Heng Wang

Papers citing "A Closer Look at Spatiotemporal Convolutions for Action Recognition"

50 / 477 papers shown

Title
CNN-Based Action Recognition and Pose Estimation for Classifying Animal Behavior from Videos: A Survey Michael Perez Corey Toler-Franklin MedIm 36 14 0 15 Jan 2023
Deep Diversity-Enhanced Feature Representation of Hyperspectral Images Jinhui Hou Zhiyu Zhu Junhui Hou Hui Liu Huanqiang Zeng Deyu Meng 18 6 0 15 Jan 2023
Triple-stream Deep Metric Learning of Great Ape Behavioural Actions Otto Brookes Majid Mirmehdi H. Kühl T. Burghardt 31 14 0 06 Jan 2023
EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding Shuhan Tan Tushar Nagarajan Kristen Grauman 26 21 0 05 Jan 2023
Ego-Only: Egocentric Action Detection without Exocentric Transferring Huiyu Wang Mitesh Singh Lorenzo Torresani EgoV 72 24 0 03 Jan 2023
Look, Listen, and Attack: Backdoor Attacks Against Video Action Recognition Hasan Hammoud Shuming Liu Mohammad Alkhrashi Fahad Albalawi Guohao Li AAML 34 8 0 03 Jan 2023
StepNet: Spatial-temporal Part-aware Network for Isolated Sign Language Recognition Xi Shen Zhedong Zheng Yi Yang SLR 30 13 0 25 Dec 2022
Similarity Contrastive Estimation for Image and Video Soft Contrastive Self-Supervised Learning J. Denize Jaonary Rabarisoa Astrid Orcesi Romain Hérault SSL 19 6 0 21 Dec 2022
A Survey on Human Action Recognition Zhou Shuchang 29 0 0 20 Dec 2022
Cross-Modal Learning with 3D Deformable Attention for Action Recognition Sangwon Kim Dasom Ahn ByoungChul Ko ViT 3DPC 38 24 0 12 Dec 2022
Multimodal Vision Transformers with Forced Attention for Behavior Analysis Tanay Agrawal Michal Balazia Philippe Muller Franccois Brémond ViT 23 9 0 07 Dec 2022
Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning A. Piergiovanni Weicheng Kuo A. Angelova ViT 36 54 0 06 Dec 2022
VLG: General Video Recognition with Web Textual Knowledge Jintao Lin Zhaoyang Liu Wenhai Wang Wayne Wu Limin Wang 39 0 0 03 Dec 2022
Re^2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization Chen Zhao Shuming Liu K. Mangalam Guohao Li 38 17 0 25 Nov 2022
Towards Good Practices for Missing Modality Robust Action Recognition Sangmin Woo Sumin Lee Yeonju Park Muhammad Adi Nugroho Changick Kim 22 43 0 25 Nov 2022
Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation Tsu-jui Fu Licheng Yu Ning Zhang Cheng-Yang Fu Jong-Chyi Su William Yang Wang Sean Bell VGen 61 37 0 23 Nov 2022
Dynamic Appearance: A Video Representation for Action Recognition with Joint Training Guoxi Huang A. Bors 27 1 0 23 Nov 2022
MagicVideo: Efficient Video Generation With Latent Diffusion Models Daquan Zhou Weimin Wang Hanshu Yan Weiwei Lv Yizhe Zhu Jiashi Feng DiffM VGen 39 373 0 20 Nov 2022
UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer Kunchang Li Yali Wang Yinan He Yizhuo Li Yi Wang Limin Wang Yu Qiao ViT 30 107 0 17 Nov 2022
Exploring State Change Capture of Heterogeneous Backbones @ Ego4D Hands and Objects Challenge 2022 Yin-Dong Zheng Guo Chen Jiahao Wang Tong Lu Liming Wang 40 0 0 16 Nov 2022
Dynamic Temporal Filtering in Video Models Fuchen Long Zhaofan Qiu Yingwei Pan Ting Yao Chong-Wah Ngo Tao Mei AI4TS 32 17 0 15 Nov 2022
Learning to Model Multimodal Semantic Alignment for Story Visualization Bowen Li Thomas Lukasiewicz DiffM 31 2 0 14 Nov 2022
MARLIN: Masked Autoencoder for facial video Representation LearnINg Zhixi Cai Shreya Ghosh Kalin Stefanov Abhinav Dhall Jianfei Cai Hamid Rezatofighi Reza Haffari Munawar Hayat ViT CVBM 27 60 0 12 Nov 2022
Soft-Landing Strategy for Alleviating the Task Discrepancy Problem in Temporal Action Localization Tasks Hyolim Kang Hanjung Kim Joungbin An Minsu Cho Seon Joo Kim 38 5 0 11 Nov 2022
Eat-Radar: Continuous Fine-Grained Intake Gesture Detection Using FMCW Radar and 3D Temporal Convolutional Network with Attention C. Wang T. S. Kumar W. de Raedt Guido Camps Hans Hallez Bart Vanrumste 24 12 0 08 Nov 2022
Egocentric Audio-Visual Noise Suppression Roshan S. Sharma Weipeng He Ju Lin Egor Lakomkin Yang Liu Kaustubh Kalgaonkar EgoV 24 1 0 07 Nov 2022
No-audio speaking status detection in crowded settings via visual pose-based filtering and wearable acceleration Jose Vargas-Quiros Laura Cabrera-Quiros Hayley Hung 29 1 0 01 Nov 2022
GliTr: Glimpse Transformers with Spatiotemporal Consistency for Online Action Prediction Samrudhdhi B. Rangrej Kevin J Liang Tal Hassner James J. Clark 27 3 0 24 Oct 2022
OLLA: Optimizing the Lifetime and Location of Arrays to Reduce the Memory Usage of Neural Networks Benoit Steiner Mostafa Elhoushi Jacob Kahn James Hegarty 31 8 0 24 Oct 2022
Learning a Grammar Inducer from Massive Uncurated Instructional Videos Songyang Zhang Linfeng Song Lifeng Jin Haitao Mi Kun Xu Dong Yu Jiebo Luo 38 5 0 22 Oct 2022
Cyclical Self-Supervision for Semi-Supervised Ejection Fraction Prediction from Echocardiogram Videos Weihang Dai Xuelong Li Xinpeng Ding Kwang-Ting Cheng 46 22 0 20 Oct 2022
MovieCLIP: Visual Scene Recognition in Movies Digbalay Bose Rajat Hebbar Krishna Somandepalli Haoyang Zhang Huayu Chen K. Cole-McLaughlin Haoran Wang Shrikanth Narayanan CLIP 22 21 0 20 Oct 2022
STAR-Transformer: A Spatio-temporal Cross Attention Transformer for Human Action Recognition Dasom Ahn Sangwon Kim H. Hong ByoungChul Ko ViT 28 97 0 14 Oct 2022
S4ND: Modeling Images and Videos as Multidimensional Signals Using State Spaces Eric N. D. Nguyen Karan Goel Albert Gu Gordon W. Downs Preey Shah Tri Dao S. Baccus Christopher Ré VLM 22 39 0 12 Oct 2022
DG-STGCN: Dynamic Spatial-Temporal Modeling for Skeleton-based Action Recognition Haodong Duan Jiaqi Wang Kai-xiang Chen Dahua Lin 38 42 0 12 Oct 2022
Learning State-Aware Visual Representations from Audible Interactions Himangi Mittal Pedro Morgado Unnat Jain Abhinav Gupta 78 23 0 27 Sep 2022
Leveraging Self-Supervised Training for Unintentional Action Recognition Enea Duka Anna Kukleva Bernt Schiele 35 1 0 23 Sep 2022
Adaptive Local-Component-aware Graph Convolutional Network for One-shot Skeleton-based Action Recognition Anqi Zhu Qiuhong Ke Mingming Gong James Bailey 41 21 0 21 Sep 2022
Audio-Visual Fusion for Emotion Recognition in the Valence-Arousal Space Using Joint Cross-Attention R Gnana Praveen Eric Granger P. Cardinal CVBM 56 31 0 19 Sep 2022
MECCANO: A Multimodal Egocentric Dataset for Humans Behavior Understanding in the Industrial-like Domain Francesco Ragusa Antonino Furnari G. Farinella EgoV 43 24 0 19 Sep 2022
Moving from 2D to 3D: volumetric medical image classification for rectal cancer staging Joohyun Lee J. Oh Inkyu Shin You-sung Kim D. Sohn Tae-Sung Kim In So Kweon MedIm 29 4 0 13 Sep 2022
A Novel Self-Knowledge Distillation Approach with Siamese Representation Learning for Action Recognition Duc-Quang Vu T. Phung Jia-Ching Wang 27 9 0 03 Sep 2022
Survey: Exploiting Data Redundancy for Optimization of Deep Learning Jou-An Chen Wei Niu Bin Ren Yanzhi Wang Xipeng Shen 23 24 0 29 Aug 2022
Lane Change Classification and Prediction with Action Recognition Networks Kai-Bin Liang Jun Wang A. Bhalerao 16 2 0 24 Aug 2022
Modality Mixer for Multi-modal Action Recognition Sumin Lee Sangmin Woo Yeonju Park Muhammad Adi Nugroho Changick Kim 26 10 0 24 Aug 2022
GSRFormer: Grounded Situation Recognition Transformer with Alternate Semantic Attention Refinement Zhi-Qi Cheng Qianwen Dai Siyao Li Teruko Mitamura Alexander G. Hauptmann 16 34 0 18 Aug 2022
Blockwise Temporal-Spatial Pathway Network SeulGi Hong Min-Kook Choi 24 1 0 05 Aug 2022
Expanding Language-Image Pretrained Models for General Video Recognition Bolin Ni Houwen Peng Minghao Chen Songyang Zhang Gaofeng Meng Jianlong Fu Shiming Xiang Haibin Ling VLM CLIP ViT 40 313 0 04 Aug 2022
Surgical Skill Assessment via Video Semantic Aggregation Zhenqiang Li Lin Gu Weimin Wang Ryosuke Nakamura Yoichi Sato 28 13 0 04 Aug 2022
Multimodal Generation of Novel Action Appearances for Synthetic-to-Real Recognition of Activities of Daily Living Zdravko Marinov David Schneider Alina Roitberg Rainer Stiefelhagen VGen 32 2 0 03 Aug 2022