
Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification

13 December 2017
Saining Xie
Chen Sun
Jonathan Huang
Zhuowen Tu
Kevin Patrick Murphy
    3DH

Papers citing "Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification"

50 / 657 papers shown
Constrained Mean Shift for Representation Learning
Ajinkya Tejankar
Soroush Abbasi Koohpayegani
Hamed Pirsiavash
SSL
54
0
0
19 Oct 2021
LSTC: Boosting Atomic Action Detection with Long-Short-Term Context
Yuxi Li
Boshen Zhang
Jian Li
Yabiao Wang
Weiyao Lin
Chengjie Wang
Jilin Li
Feiyue Huang
74
5
0
19 Oct 2021
MAAD: A Model and Dataset for "Attended Awareness" in Driving
Deepak Gopinath
Guy Rosman
Simon Stent
K. Terahata
L. Fletcher
B. Argall
John J. Leonard
38
10
0
16 Oct 2021
Benchmarking the Robustness of Spatial-Temporal Models Against Corruptions
Chenyu Yi
Siyuan Yang
Haoliang Li
Yap-Peng Tan
Alex C. Kot
92
33
0
13 Oct 2021
TAda! Temporally-Adaptive Convolutions for Video Understanding
Ziyuan Huang
Shiwei Zhang
Liang Pan
Zhiwu Qing
Mingqian Tang
Ziwei Liu
M. Ang
140
49
0
12 Oct 2021
Early Melanoma Diagnosis with Sequential Dermoscopic Images
Zhen Yu
Jennifer Nguyen
Toàn D. Nguyên
J. Kelly
C. Mclean
Paul Bonnington
Lei Zhang
Victoria Mar
Z. Ge
56
44
0
12 Oct 2021
Video Is Graph: Structured Graph Module for Video Action Recognition
Rongjie Li
Xiaojun Wu
Tianyang Xu
93
12
0
12 Oct 2021
Spatio-Temporal Video Representation Learning for AI Based Video Playback Style Prediction
Rishubh Parihar
Gaurav Ramola
Ranajit Saha
Raviprasad Kini
Aniket Rege
S. Velusamy
72
1
0
03 Oct 2021
Unsupervised Few-Shot Action Recognition via Action-Appearance Aligned Meta-Adaptation
Jay Patravali
Gaurav Mittal
Ye Yu
Fuxin Li
Mei Chen
92
19
0
30 Sep 2021
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Christoph Feichtenhofer
CLIP VLM
317
582
0
28 Sep 2021
TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device
Ji Lin
Chuang Gan
Kuan-Chieh Wang
Song Han
100
65
0
27 Sep 2021
Joint Multimedia Event Extraction from Video and Article
Brian Chen
Xudong Lin
Christopher Thomas
Manling Li
Shoya Yoshida
Lovish Chum
Heng Ji
Shih-Fu Chang
VGen
83
26
0
27 Sep 2021
Group Shift Pointwise Convolution for Volumetric Medical Image Segmentation
Junjun He
Jin Ye
Cheng Li
Diping Song
Wanli Chen
Shanshan Wang
Lixu Gu
Yu Qiao
48
3
0
26 Sep 2021
Audio-Visual Speech Recognition is Worth 32×32×8 Voxels
Dmitriy Serdyuk
Otavio Braga
Olivier Siohan
ViT
87
7
0
20 Sep 2021
Towards High-Quality Temporal Action Detection with Sparse Proposals
Jiannan Wu
Pei Sun
Shoufa Chen
Jiewen Yang
Zihao Qi
Lan Ma
Ping Luo
ViT
73
10
0
18 Sep 2021
ActionCLIP: A New Paradigm for Video Action Recognition
Mengmeng Wang
Jiazheng Xing
Yong Liu
VLM
221
372
0
17 Sep 2021
Overview of Tencent Multi-modal Ads Video Understanding Challenge
Zhenzhi Wang
Liyu Wu
Zhimin Li
Jiangfeng Xiong
Qinglin Lu
58
4
0
16 Sep 2021
Deep Visual Navigation under Partial Observability
Bo Ai
Wei Gao
Vinay
David Hsu
78
11
0
16 Sep 2021
Multi-modal Representation Learning for Video Advertisement Content Structuring
Daya Guo
Zhaoyang Zeng
41
4
0
04 Sep 2021
Revisiting 3D ResNets for Video Recognition
Xianzhi Du
Yeqing Li
Huayu Chen
Rui Qian
Jing Li
Irwan Bello
160
17
0
03 Sep 2021
Hierarchical 3D Feature Learning for Pancreas Segmentation
Federica Proietto Salanitri
Giovanni Bellitto
Ismail Irmakci
S. Palazzo
Ulas Bagci
C. Spampinato
MedIm
44
9
0
03 Sep 2021
DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion
Wei Niu
Jiexiong Guan
Yanzhi Wang
G. Agrawal
Bin Ren
AI4CE
76
153
0
30 Aug 2021
Efficient Visual Recognition with Deep Neural Networks: A Survey on Recent Advances and New Directions
Yang Wu
Dingheng Wang
Xiaotong Lu
Fan Yang
Guoqi Li
W. Dong
Jianbo Shi
104
18
0
30 Aug 2021
Searching for Two-Stream Models in Multivariate Space for Video Recognition
Xinyu Gong
Heng Wang
Zheng Shou
Matt Feiszli
Zhangyang Wang
Zhicheng Yan
87
9
0
30 Aug 2021
Shifted Chunk Transformer for Spatio-Temporal Representational Learning
Xuefan Zha
Wentao Zhu
Tingxun Lv
Sen Yang
Ji Liu
AI4TS ViT
88
27
0
26 Aug 2021
Identity-aware Graph Memory Network for Action Detection
Jingcheng Ni
Jie Qin
Di Huang
81
9
0
26 Aug 2021
Spatio-Temporal Self-Attention Network for Video Saliency Prediction
Ziqiang Wang
Zhi Liu
Gongyang Li
Yang Wang
Tianhong Zhang
Lihua Xu
Jijun Wang
3DPC
106
47
0
24 Aug 2021
ParamCrop: Parametric Cubic Cropping for Video Contrastive Learning
Zhiwu Qing
Ziyuan Huang
Shiwei Zhang
Mingqian Tang
Changxin Gao
M. Ang
Ronglei Ji
Nong Sang
83
3
0
24 Aug 2021
TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment
Jianwei Yang
Yonatan Bisk
Jianfeng Gao
113
140
0
23 Aug 2021
MM-ViT: Multi-Modal Video Transformer for Compressed Video Action Recognition
Jiawei Chen
C. Ho
ViT
101
78
0
20 Aug 2021
Self-Supervised Video Representation Learning with Meta-Contrastive Network
Yuanze Lin
Xun Guo
Yan Lu
SSL
71
41
0
19 Aug 2021
Multi-Camera Trajectory Forecasting with Trajectory Tensors
Olly Styles
T. Guha
Victor Sanchez
38
7
0
10 Aug 2021
TrUMAn: Trope Understanding in Movies and Animations
Hung-Ting Su
Po-Wei Shen
Bing-Chen Tsai
Wen-Feng Cheng
Ke-Jyun Wang
Winston H. Hsu
26
6
0
10 Aug 2021
Video Contrastive Learning with Global Context
Haofei Kuang
Yi Zhu
Zhi-Li Zhang
Xinyu Li
Joseph Tighe
Sören Schwertfeger
C. Stachniss
Mu Li
SSL AI4TS
91
60
0
05 Aug 2021
Token Shift Transformer for Video Classification
Hao Zhang
Y. Hao
Chong-Wah Ngo
ViT
82
119
0
05 Aug 2021
Enhancing Self-supervised Video Representation Learning via Multi-level Feature Optimization
Rui Qian
Yuxi Li
Huabin Liu
John See
Shuangrui Ding
Xian Liu
Dian Li
Weiyao Lin
82
42
0
04 Aug 2021
Skeleton Cloud Colorization for Unsupervised 3D Action Representation Learning
Siyuan Yang
Jun Liu
Shijian Lu
Meng Hwa Er
Alex C. Kot
3DH 3DPC
115
95
0
04 Aug 2021
Temporal Alignment Prediction for Few-Shot Video Classification
Fei Pan
Chunlei Xu
Jie Guo
Yanwen Guo
AI4TS
41
1
0
26 Jul 2021
Spatio-Temporal Representation Factorization for Video-based Person Re-Identification
Abhishek Aich
Meng Zheng
Srikrishna Karanam
Terrence Chen
Amit K. Roy-Chowdhury
Ziyan Wu
126
72
0
25 Jul 2021
Transcript to Video: Efficient Clip Sequencing from Texts
Yu Xiong
Fabian Caba Heilbron
Dahua Lin
CLIP
62
10
0
25 Jul 2021
Adaptive Recursive Circle Framework for Fine-grained Action Recognition
Hanxi Lin
Xinxiao Wu
Jiebo Luo
65
2
0
25 Jul 2021
EAN: Event Adaptive Network for Enhanced Action Recognition
Yuan Tian
Yichao Yan
Guangtao Zhai
G. Guo
Zhiyong Gao
79
42
0
22 Jul 2021
Let's Play for Action: Recognizing Activities of Daily Living by Learning from Life Simulation Video Games
Alina Roitberg
David Schneider
Aulia Djamal
C. Seibold
Simon Reiß
Rainer Stiefelhagen
91
31
0
12 Jul 2021
Delta Sampling R-BERT for limited data and low-light action recognition
Sanchit Hira
Ritwik Das
Abhinav Modi
D. Pakhomov
109
17
0
12 Jul 2021
Aligning Correlation Information for Domain Adaptation in Action Recognition
Yuecong Xu
Jianfei Yang
Haozhi Cao
K. Mao
Jianxiong Yin
Simon See
87
39
0
11 Jul 2021
Modality specific U-Net variants for biomedical image segmentation: A survey
Narinder Singh Punn
Sonali Agarwal
SSeg
104
151
0
09 Jul 2021
Video 3D Sampling for Self-supervised Representation Learning
Wei Li
Dezhao Luo
Bo Fang
Yu Zhou
Weiping Wang
62
7
0
08 Jul 2021
VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer
Zineng Tang
Jaemin Cho
Hao Tan
Joey Tianyi Zhou
VLM
59
29
0
06 Jul 2021
Inter-intra Variant Dual Representations for Self-supervised Video Recognition
Lin Zhang
Qi She
Zhengyang Shen
Changhu Wang
SSL
78
9
0
02 Jul 2021
iMiGUE: An Identity-free Video Dataset for Micro-Gesture Understanding and Emotion Analysis
Xin Liu
Henglin Shi
Haoyu Chen
Zitong Yu
Xiaobai Li
Guoying Zhao
82
83
0
01 Jul 2021