ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1712.04851
  4. Cited By
Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in
  Video Classification

Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification

13 December 2017
Saining Xie
Chen Sun
Jonathan Huang
Zhuowen Tu
Kevin Patrick Murphy
    3DH
ArXivPDFHTML

Papers citing "Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification"

50 / 650 papers shown
Title
Unsupervised Few-Shot Action Recognition via Action-Appearance Aligned
  Meta-Adaptation
Unsupervised Few-Shot Action Recognition via Action-Appearance Aligned Meta-Adaptation
Jay Patravali
Gaurav Mittal
Ye Yu
Fuxin Li
Mei Chen
18
19
0
30 Sep 2021
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text
  Understanding
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Florian Metze Luke Zettlemoyer Christoph Feichtenhofer
CLIP
VLM
259
560
0
28 Sep 2021
TSM: Temporal Shift Module for Efficient and Scalable Video
  Understanding on Edge Device
TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device
Ji Lin
Chuang Gan
Kuan-Chieh Jackson Wang
Song Han
40
64
0
27 Sep 2021
Joint Multimedia Event Extraction from Video and Article
Joint Multimedia Event Extraction from Video and Article
Brian Chen
Xudong Lin
Christopher Thomas
Manling Li
Shoya Yoshida
Lovish Chum
Heng Ji
Shih-Fu Chang
VGen
47
26
0
27 Sep 2021
Group Shift Pointwise Convolution for Volumetric Medical Image
  Segmentation
Group Shift Pointwise Convolution for Volumetric Medical Image Segmentation
Junjun He
Jin Ye
Cheng Li
Diping Song
Wanli Chen
Shanshan Wang
Lixu Gu
Yu Qiao
21
3
0
26 Sep 2021
Audio-Visual Speech Recognition is Worth 32$\times$32$\times$8 Voxels
Audio-Visual Speech Recognition is Worth 32×\times×32×\times×8 Voxels
Dmitriy Serdyuk
Otavio Braga
Olivier Siohan
ViT
31
7
0
20 Sep 2021
Towards High-Quality Temporal Action Detection with Sparse Proposals
Towards High-Quality Temporal Action Detection with Sparse Proposals
Jiannan Wu
Pei Sun
Shoufa Chen
Jiewen Yang
Zihao Qi
Lan Ma
Ping Luo
ViT
40
10
0
18 Sep 2021
ActionCLIP: A New Paradigm for Video Action Recognition
ActionCLIP: A New Paradigm for Video Action Recognition
Mengmeng Wang
Jiazheng Xing
Yong Liu
VLM
152
362
0
17 Sep 2021
Overview of Tencent Multi-modal Ads Video Understanding Challenge
Overview of Tencent Multi-modal Ads Video Understanding Challenge
Zhenzhi Wang
Liyu Wu
Zhimin Li
Jiangfeng Xiong
Qinglin Lu
32
4
0
16 Sep 2021
Deep Visual Navigation under Partial Observability
Deep Visual Navigation under Partial Observability
Bo Ai
Wei Gao
Vinay
David Hsu
21
10
0
16 Sep 2021
Multi-modal Representation Learning for Video Advertisement Content
  Structuring
Multi-modal Representation Learning for Video Advertisement Content Structuring
Daya Guo
Zhaoyang Zeng
30
4
0
04 Sep 2021
Revisiting 3D ResNets for Video Recognition
Revisiting 3D ResNets for Video Recognition
Xianzhi Du
Yeqing Li
Huayu Chen
Rui Qian
Jing Li
Irwan Bello
56
17
0
03 Sep 2021
Hierarchical 3D Feature Learning for Pancreas Segmentation
Hierarchical 3D Feature Learning for Pancreas Segmentation
Federica Proietto Salanitri
Giovanni Bellitto
Ismail Irmakci
S. Palazzo
Ulas Bagci
C. Spampinato
MedIm
11
10
0
03 Sep 2021
DNNFusion: Accelerating Deep Neural Networks Execution with Advanced
  Operator Fusion
DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion
Wei Niu
Jiexiong Guan
Yanzhi Wang
G. Agrawal
Bin Ren
AI4CE
30
146
0
30 Aug 2021
Efficient Visual Recognition with Deep Neural Networks: A Survey on
  Recent Advances and New Directions
Efficient Visual Recognition with Deep Neural Networks: A Survey on Recent Advances and New Directions
Yang Wu
Dingheng Wang
Xiaotong Lu
Fan Yang
Guoqi Li
W. Dong
Jianbo Shi
29
18
0
30 Aug 2021
Searching for Two-Stream Models in Multivariate Space for Video
  Recognition
Searching for Two-Stream Models in Multivariate Space for Video Recognition
Xinyu Gong
Heng Wang
Zheng Shou
Matt Feiszli
Zhangyang Wang
Zhicheng Yan
42
9
0
30 Aug 2021
Shifted Chunk Transformer for Spatio-Temporal Representational Learning
Shifted Chunk Transformer for Spatio-Temporal Representational Learning
Xuefan Zha
Wentao Zhu
Tingxun Lv
Sen Yang
Ji Liu
AI4TS
ViT
33
27
0
26 Aug 2021
Identity-aware Graph Memory Network for Action Detection
Identity-aware Graph Memory Network for Action Detection
Jingcheng Ni
Jie Qin
Di Huang
28
9
0
26 Aug 2021
Spatio-Temporal Self-Attention Network for Video Saliency Prediction
Spatio-Temporal Self-Attention Network for Video Saliency Prediction
Ziqiang Wang
Zhi Liu
Gongyang Li
Yang Wang
Tianhong Zhang
Lihua Xu
Jijun Wang
3DPC
38
44
0
24 Aug 2021
ParamCrop: Parametric Cubic Cropping for Video Contrastive Learning
ParamCrop: Parametric Cubic Cropping for Video Contrastive Learning
Zhiwu Qing
Ziyuan Huang
Shiwei Zhang
Mingqian Tang
Changxin Gao
M. Ang
Ronglei Ji
Nong Sang
43
3
0
24 Aug 2021
TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment
TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment
Jianwei Yang
Yonatan Bisk
Jianfeng Gao
27
137
0
23 Aug 2021
MM-ViT: Multi-Modal Video Transformer for Compressed Video Action
  Recognition
MM-ViT: Multi-Modal Video Transformer for Compressed Video Action Recognition
Jiawei Chen
C. Ho
ViT
26
77
0
20 Aug 2021
Self-Supervised Video Representation Learning with Meta-Contrastive
  Network
Self-Supervised Video Representation Learning with Meta-Contrastive Network
Yuanze Lin
Xun Guo
Yan Lu
SSL
19
41
0
19 Aug 2021
Multi-Camera Trajectory Forecasting with Trajectory Tensors
Multi-Camera Trajectory Forecasting with Trajectory Tensors
Olly Styles
T. Guha
Victor Sanchez
27
7
0
10 Aug 2021
TrUMAn: Trope Understanding in Movies and Animations
TrUMAn: Trope Understanding in Movies and Animations
Hung-Ting Su
Po-Wei Shen
Bing-Chen Tsai
Wen-Feng Cheng
Ke-Jyun Wang
Winston H. Hsu
12
4
0
10 Aug 2021
Video Contrastive Learning with Global Context
Video Contrastive Learning with Global Context
Haofei Kuang
Yi Zhu
Zhi-Li Zhang
Xinyu Li
Joseph Tighe
Sören Schwertfeger
C. Stachniss
Mu Li
SSL
AI4TS
21
60
0
05 Aug 2021
Token Shift Transformer for Video Classification
Token Shift Transformer for Video Classification
Hao Zhang
Y. Hao
Chong-Wah Ngo
ViT
29
116
0
05 Aug 2021
Enhancing Self-supervised Video Representation Learning via Multi-level
  Feature Optimization
Enhancing Self-supervised Video Representation Learning via Multi-level Feature Optimization
Rui Qian
Yuxi Li
Huabin Liu
John See
Shuangrui Ding
Xian Liu
Dian Li
Weiyao Lin
35
42
0
04 Aug 2021
Skeleton Cloud Colorization for Unsupervised 3D Action Representation
  Learning
Skeleton Cloud Colorization for Unsupervised 3D Action Representation Learning
Siyuan Yang
Jun Liu
Shijian Lu
Meng Hwa Er
Alex C. Kot
3DH
3DPC
41
91
0
04 Aug 2021
Temporal Alignment Prediction for Few-Shot Video Classification
Temporal Alignment Prediction for Few-Shot Video Classification
Fei Pan
Chunlei Xu
Jie Guo
Yanwen Guo
AI4TS
21
1
0
26 Jul 2021
Spatio-Temporal Representation Factorization for Video-based Person
  Re-Identification
Spatio-Temporal Representation Factorization for Video-based Person Re-Identification
Abhishek Aich
Meng Zheng
Srikrishna Karanam
Terrence Chen
A. Roy-Chowdhury
Ziyan Wu
37
70
0
25 Jul 2021
Transcript to Video: Efficient Clip Sequencing from Texts
Transcript to Video: Efficient Clip Sequencing from Texts
Yu Xiong
Fabian Caba Heilbron
Dahua Lin
CLIP
28
10
0
25 Jul 2021
Adaptive Recursive Circle Framework for Fine-grained Action Recognition
Adaptive Recursive Circle Framework for Fine-grained Action Recognition
Hanxi Lin
Xinxiao Wu
Jiebo Luo
25
1
0
25 Jul 2021
EAN: Event Adaptive Network for Enhanced Action Recognition
EAN: Event Adaptive Network for Enhanced Action Recognition
Yuan Tian
Yichao Yan
Guangtao Zhai
G. Guo
Zhiyong Gao
35
41
0
22 Jul 2021
Let's Play for Action: Recognizing Activities of Daily Living by
  Learning from Life Simulation Video Games
Let's Play for Action: Recognizing Activities of Daily Living by Learning from Life Simulation Video Games
Alina Roitberg
David Schneider
Aulia Djamal
C. Seibold
Simon Reiß
Rainer Stiefelhagen
36
30
0
12 Jul 2021
Delta Sampling R-BERT for limited data and low-light action recognition
Delta Sampling R-BERT for limited data and low-light action recognition
Sanchit Hira
Ritwik Das
Abhinav Modi
D. Pakhomov
75
17
0
12 Jul 2021
Aligning Correlation Information for Domain Adaptation in Action
  Recognition
Aligning Correlation Information for Domain Adaptation in Action Recognition
Yuecong Xu
Jianfei Yang
Haozhi Cao
K. Mao
Jianxiong Yin
Simon See
24
38
0
11 Jul 2021
Modality specific U-Net variants for biomedical image segmentation: A
  survey
Modality specific U-Net variants for biomedical image segmentation: A survey
Narinder Singh Punn
Sonali Agarwal
SSeg
32
144
0
09 Jul 2021
Video 3D Sampling for Self-supervised Representation Learning
Video 3D Sampling for Self-supervised Representation Learning
Wei Li
Dezhao Luo
Bo Fang
Yu Zhou
Weiping Wang
20
6
0
08 Jul 2021
VidLanKD: Improving Language Understanding via Video-Distilled Knowledge
  Transfer
VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer
Zineng Tang
Jaemin Cho
Hao Tan
Joey Tianyi Zhou
VLM
32
29
0
06 Jul 2021
Inter-intra Variant Dual Representations forSelf-supervised Video
  Recognition
Inter-intra Variant Dual Representations forSelf-supervised Video Recognition
Lin Zhang
Qi She
Zhengyang Shen
Changhu Wang
SSL
30
9
0
02 Jul 2021
iMiGUE: An Identity-free Video Dataset for Micro-Gesture Understanding
  and Emotion Analysis
iMiGUE: An Identity-free Video Dataset for Micro-Gesture Understanding and Emotion Analysis
Xin Liu
Henglin Shi
Haoyu Chen
Zitong Yu
Xiaobai Li
Guoying Zhao
21
80
0
01 Jul 2021
Attention Bottlenecks for Multimodal Fusion
Attention Bottlenecks for Multimodal Fusion
Arsha Nagrani
Shan Yang
Anurag Arnab
A. Jansen
Cordelia Schmid
Chen Sun
25
543
0
30 Jun 2021
When Video Classification Meets Incremental Classes
When Video Classification Meets Incremental Classes
Hanbin Zhao
Xin Qin
Shihao Su
Yongjian Fu
Zibo Lin
Xi Li
CLL
21
28
0
30 Jun 2021
Long-Short Temporal Modeling for Efficient Action Recognition
Long-Short Temporal Modeling for Efficient Action Recognition
Liyu Wu
Yuexian Zou
Can Zhang
21
1
0
30 Jun 2021
Unsupervised Discovery of Actions in Instructional Videos
Unsupervised Discovery of Actions in Instructional Videos
A. Piergiovanni
A. Angelova
Michael S. Ryoo
Irfan Essa
19
3
0
28 Jun 2021
Hyperbolic Busemann Learning with Ideal Prototypes
Hyperbolic Busemann Learning with Ideal Prototypes
Mina Ghadimi Atigh
Martin Keller-Ressel
Pascal Mettes
33
36
0
28 Jun 2021
Can An Image Classifier Suffice For Action Recognition?
Can An Image Classifier Suffice For Action Recognition?
Quanfu Fan
Chun-Fu Chen
Chen
Yikang Shen
ViT
34
33
0
26 Jun 2021
Hierarchical Object-oriented Spatio-Temporal Reasoning for Video
  Question Answering
Hierarchical Object-oriented Spatio-Temporal Reasoning for Video Question Answering
Long Hoang Dang
T. Le
Vuong Le
T. Tran
30
60
0
25 Jun 2021
Video Swin Transformer
Video Swin Transformer
Ze Liu
Jia Ning
Yue Cao
Yixuan Wei
Zheng-Wei Zhang
Stephen Lin
Han Hu
ViT
44
1,444
0
24 Jun 2021
Previous
123...789...111213
Next