ResearchTrend.AI
Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification

13 December 2017
Saining Xie
Chen Sun
Jonathan Huang
Z. Tu
Kevin Patrick Murphy
3DH

Papers citing "Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification"

50 / 650 papers shown
Self-supervised Video Representation Learning by Uncovering Spatio-temporal Statistics
Jiangliu Wang
Jianbo Jiao
Linchao Bao
Shengfeng He
Wei Liu
Yunhui Liu
SSL
AI4TS
21
55
0
31 Aug 2020
DMD: A Large-Scale Multi-Modal Driver Monitoring Dataset for Attention and Alertness Analysis
J. Ortega
Neslihan Köse
P. Cañas
Min-An Chao
A. Unnervik
Marcos Nieto
Oihana Otaegui
L. Salgado
24
91
0
27 Aug 2020
Making a Case for 3D Convolutions for Object Segmentation in Videos
Sabarinath Mahadevan
A. Athar
Aljosa Osep
Sebastian Hennen
Laura Leal-Taixé
Bastian Leibe
VOS
21
86
0
26 Aug 2020
Effective Action Recognition with Embedded Key Point Shifts
Haozhi Cao
Yuecong Xu
Jianfei Yang
K. Mao
Jianxiong Yin
Simon See
15
7
0
26 Aug 2020
Global-local Enhancement Network for NMFs-aware Sign Language Recognition
Hezhen Hu
Wen-gang Zhou
Junfu Pu
Houqiang Li
SLR
19
51
0
24 Aug 2020
AssembleNet++: Assembling Modality Representations via Attention Connections
Michael S. Ryoo
A. Piergiovanni
Juhana Kangaspunta
A. Angelova
15
44
0
18 Aug 2020
Self-supervised Video Representation Learning by Pace Prediction
Jiangliu Wang
Jianbo Jiao
Yunhui Liu
SSL
AI4TS
16
233
0
13 Aug 2020
Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning
Ying Cheng
Ruize Wang
Zhihao Pan
Rui Feng
Yuejie Zhang
SSL
36
106
0
13 Aug 2020
TransNet V2: An effective deep network architecture for fast shot transition detection
Tomás Soucek
Jakub Lokoč
11
118
0
11 Aug 2020
Spatiotemporal Contrastive Video Representation Learning
Rui Qian
Tianjian Meng
Boqing Gong
Ming-Hsuan Yang
Haoran Wang
Serge J. Belongie
Huayu Chen
SSL
AI4TS
41
492
0
09 Aug 2020
PAN: Towards Fast Action Recognition via Learning Persistence of Appearance
Can Zhang
Yuexian Zou
Guang Chen
Lei Gan
15
39
0
08 Aug 2020
Exploring Relations in Untrimmed Videos for Self-Supervised Learning
Dezhao Luo
Bo Fang
Yu Zhou
Yucan Zhou
Dayan Wu
Weiping Wang
25
21
0
06 Aug 2020
Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework
Li Tao
Xueting Wang
T. Yamasaki
SSL
17
106
0
06 Aug 2020
Late Temporal Modeling in 3D CNN Architectures with BERT for Action Recognition
M. E. Kalfaoglu
Sinan Kalkan
A. Aydin Alatan
3DPC
33
140
0
03 Aug 2020
Residual Frames with Efficient Pseudo-3D CNN for Human Action Recognition
Jiawei Chen
Jenson Hsiao
C. Ho
10
5
0
03 Aug 2020
The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020)
Samuel Albanie
Yang Liu
Arsha Nagrani
Antoine Miech
Ernesto Coto
...
Kaixu Cui
Hui Liu
Chen Wang
Yudong Jiang
Xiaoshuai Hao
34
9
0
03 Aug 2020
Learning Video Representations from Textual Web Supervision
Jonathan C. Stroud
Zhichao Lu
Chen Sun
Jia Deng
Rahul Sukthankar
Cordelia Schmid
David A. Ross
SSL
40
48
0
29 Jul 2020
Approximated Bilinear Modules for Temporal Modeling
Xinqi Zhu
Chang Xu
Langwen Hui
Cewu Lu
Dacheng Tao
25
23
0
25 Jul 2020
AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification
Xiaofang Wang
Xuehan Xiong
Maxim Neumann
A. Piergiovanni
Michael S. Ryoo
A. Angelova
Kris M. Kitani
Wei Hua
19
51
0
23 Jul 2020
Perceptron Synthesis Network: Rethinking the Action Scale Variances in Videos
Yuan Tian
Guangtao Zhai
Zhiyong Gao
27
0
0
22 Jul 2020
Depthwise Spatio-Temporal STFT Convolutional Neural Networks for Human Action Recognition
Sudhakar Kumawat
Manisha Verma
Yuta Nakashima
Shanmuganathan Raman
141
42
0
22 Jul 2020
Directional Temporal Modeling for Action Recognition
Xinyu Li
Bing Shuai
Joseph Tighe
6
41
0
21 Jul 2020
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
424
596
0
21 Jul 2020
Hierarchical Contrastive Motion Learning for Video Action Recognition
Xitong Yang
Xiaodong Yang
Sifei Liu
Deqing Sun
L. Davis
Jan Kautz
SSL
35
13
0
20 Jul 2020
MotionSqueeze: Neural Motion Feature Learning for Video Understanding
Heeseung Kwon
Manjin Kim
Suha Kwak
Minsu Cho
FAtt
20
128
0
20 Jul 2020
RT3D: Achieving Real-Time Execution of 3D Convolutional Neural Networks on Mobile Devices
Wei Niu
Mengshu Sun
Zechao Li
Jou-An Chen
Jiexiong Guan
Xipeng Shen
Yanzhi Wang
Sijia Liu
Xue Lin
Bin Ren
MQ
15
12
0
20 Jul 2020
Region-based Non-local Operation for Video Classification
Guoxi Huang
A. Bors
14
11
0
17 Jul 2020
Temporal Distinct Representation Learning for Action Recognition
Junwu Weng
Donghao Luo
Yabiao Wang
Ying Tai
Chengjie Wang
Jilin Li
Feiyue Huang
Xudong Jiang
Junsong Yuan
11
26
0
15 Jul 2020
Alleviating Over-segmentation Errors by Detecting Action Boundaries
Yuchi Ishikawa
Seito Kasai
Y. Aoki
Hirokatsu Kataoka
14
137
0
14 Jul 2020
IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos
Gyeongsik Moon
Heeseung Kwon
Kyoung Mu Lee
Minsu Cho
22
26
0
13 Jul 2020
Universal-to-Specific Framework for Complex Action Recognition
Peisen Zhao
Lingxi Xie
Ya Zhang
Qi Tian
19
9
0
13 Jul 2020
Aligning Videos in Space and Time
Senthil Purushwalkam
Tian-Chun Ye
Saurabh Gupta
Abhinav Gupta
27
23
0
09 Jul 2020
Group Ensemble: Learning an Ensemble of ConvNets in a single ConvNet
Hao Chen
Abhinav Shrivastava
23
14
0
01 Jul 2020
Self-Supervised MultiModal Versatile Networks
Jean-Baptiste Alayrac
Adrià Recasens
R. Schneider
Relja Arandjelović
Jason Ramapuram
J. Fauw
Lucas Smaira
Sander Dieleman
Andrew Zisserman
SSL
40
371
0
29 Jun 2020
Dynamic Sampling Networks for Efficient Action Recognition in Videos
Yin-Dong Zheng
Zhaoyang Liu
Tong Lu
Limin Wang
16
74
0
28 Jun 2020
Counting Out Time: Class Agnostic Video Repetition Counting in the Wild
Debidatta Dwibedi
Y. Aytar
Jonathan Tompson
P. Sermanet
Andrew Zisserman
AI4TS
23
109
0
27 Jun 2020
Motion Representation Using Residual Frames with 3D CNN
Li Tao
Xueting Wang
T. Yamasaki
3DPC
23
1
0
21 Jun 2020
Melanoma Diagnosis with Spatio-Temporal Feature Learning on Sequential Dermoscopic Images
Zhen Yu
Jennifer Nguyen
Xiaojun Chang
J. Kelly
C. Mclean
Lei Zhang
Victoria Mar
Z. Ge
MedIm
6
3
0
19 Jun 2020
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization
Junting Pan
Siyu Chen
Zheng Shou
Yu Liu
Jing Shao
Hongsheng Li
3DPC
19
150
0
14 Jun 2020
DTG-Net: Differentiated Teachers Guided Self-Supervised Video Action Recognition
Ziming Liu
Guangyu Gao
A. K. Qin
Jinyang Li
ViT
6
1
0
13 Jun 2020
Open-Narrow-Synechiae Anterior Chamber Angle Classification in AS-OCT Sequences
Huaying Hao
Huazhu Fu
Yanwu Xu
Jianlong Yang
Fei Li
Xiulan Zhang
Jiang-Dong Liu
Yitian Zhao
107
8
0
09 Jun 2020
PNL: Efficient Long-Range Dependencies Extraction with Pyramid Non-Local Module for Action Recognition
Yuecong Xu
Haozhi Cao
Jianfei Yang
K. Mao
Jianxiong Yin
Simon See
18
5
0
09 Jun 2020
ARID: A New Dataset for Recognizing Action in the Dark
Yuecong Xu
Jianfei Yang
Haozhi Cao
K. Mao
Jianxiong Yin
Simon See
24
71
0
06 Jun 2020
In the Eye of the Beholder: Gaze and Actions in First Person Video
Yin Li
Miao Liu
James M. Rehg
EgoV
27
69
0
31 May 2020
Which scaling rule applies to Artificial Neural Networks
János Végh
37
9
0
15 May 2020
Adaptive Interaction Modeling via Graph Operations Search
Haoxin Li
Weishi Zheng
Yu Tao
Haifeng Hu
Jianhuang Lai
26
5
0
05 May 2020
Beyond Instructional Videos: Probing for More Diverse Visual-Textual Grounding on YouTube
Jack Hessel
Zhenhai Zhu
Bo Pang
Radu Soricut
15
4
0
29 Apr 2020
Skeleton Focused Human Activity Recognition in RGB Video
Bruce X. B. Yu
Yan Liu
Keith C. C. Chan
26
4
0
29 Apr 2020
SpeedNet: Learning the Speediness in Videos
Sagie Benaim
Ariel Ephrat
Oran Lang
Inbar Mosseri
William T. Freeman
Michael Rubinstein
Michal Irani
Tali Dekel
17
257
0
13 Apr 2020
Spatiotemporal Fusion in 3D CNNs: A Probabilistic View
Yizhou Zhou
Xiaoyan Sun
Chong Luo
Zhengjun Zha
Wenjun Zeng
3DPC
11
20
0
10 Apr 2020