arXiv: 1712.04851

Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification

13 December 2017
Saining Xie
Chen Sun
Jonathan Huang
Zhuowen Tu
Kevin Patrick Murphy
    3DH

Papers citing "Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification"

50 / 657 papers shown
Support-set bottlenecks for video-text representation learning
Mandela Patrick
Po-Yao (Bernie) Huang
Yuki M. Asano
Florian Metze
Alexander G. Hauptmann
João Henriques
Andrea Vedaldi
108
249
0
06 Oct 2020
Hierarchical Domain-Adapted Feature Learning for Video Saliency Prediction
Giovanni Bellitto
Federica Proietto Salanitri
S. Palazzo
Francesco Rundo
Daniela Giordano
C. Spampinato
MDE
155
56
0
02 Oct 2020
PERF-Net: Pose Empowered RGB-Flow Net
Yinxiao Li
Zhichao Lu
Xuehan Xiong
Jonathan Huang
3DH
85
17
0
28 Sep 2020
On the spatiotemporal behavior in biology-mimicking computing systems
J. Végh
Ádám-József Berki
39
6
0
18 Sep 2020
Discovering Dynamic Salient Regions for Spatio-Temporal Graph Neural Networks
Iulia Duta
Andrei Liviu Nicolicioiu
Marius Leordeanu
70
6
0
17 Sep 2020
Multi-Label Activity Recognition using Activity-specific Features and Activity Correlations
Yanyi Zhang
Xinyu Li
I. Marsic
HAI
78
24
0
16 Sep 2020
Online Spatiotemporal Action Detection and Prediction via Causal Representations
Gurkirt Singh
3DPC CML
82
0
0
31 Aug 2020
Self-supervised Video Representation Learning by Uncovering Spatio-temporal Statistics
Jiangliu Wang
Jianbo Jiao
Linchao Bao
Shengfeng He
Wei Liu
Yunhui Liu
SSL AI4TS
68
55
0
31 Aug 2020
DMD: A Large-Scale Multi-Modal Driver Monitoring Dataset for Attention and Alertness Analysis
J. Ortega
Neslihan Köse
P. Cañas
Min-An Chao
A. Unnervik
Marcos Nieto
Oihana Otaegui
L. Salgado
87
92
0
27 Aug 2020
Making a Case for 3D Convolutions for Object Segmentation in Videos
Sabarinath Mahadevan
A. Athar
Aljosa Osep
Sebastian Hennen
Laura Leal-Taixé
Bastian Leibe
VOS
81
88
0
26 Aug 2020
Effective Action Recognition with Embedded Key Point Shifts
Haozhi Cao
Yuecong Xu
Jianfei Yang
K. Mao
Jianxiong Yin
Simon See
47
7
0
26 Aug 2020
Global-local Enhancement Network for NMFs-aware Sign Language Recognition
Hezhen Hu
Wen-gang Zhou
Junfu Pu
Houqiang Li
SLR
81
54
0
24 Aug 2020
AssembleNet++: Assembling Modality Representations via Attention Connections
Michael S. Ryoo
A. Piergiovanni
Juhana Kangaspunta
A. Angelova
65
45
0
18 Aug 2020
Self-supervised Video Representation Learning by Pace Prediction
Jiangliu Wang
Jianbo Jiao
Yunhui Liu
SSL AI4TS
84
236
0
13 Aug 2020
Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning
Ying Cheng
Ruize Wang
Zhihao Pan
Rui Feng
Yuejie Zhang
SSL
148
110
0
13 Aug 2020
TransNet V2: An effective deep network architecture for fast shot transition detection
Tomás Soucek
Jakub Lokoč
98
124
0
11 Aug 2020
Spatiotemporal Contrastive Video Representation Learning
Rui Qian
Tianjian Meng
Boqing Gong
Ming-Hsuan Yang
Huisheng Wang
Serge J. Belongie
Yin Cui
SSL AI4TS
140
502
0
09 Aug 2020
PAN: Towards Fast Action Recognition via Learning Persistence of Appearance
Can Zhang
Yuexian Zou
Guang Chen
Lei Gan
89
39
0
08 Aug 2020
Exploring Relations in Untrimmed Videos for Self-Supervised Learning
Dezhao Luo
Bo Fang
Yu Zhou
Yucan Zhou
Dayan Wu
Weiping Wang
90
22
0
06 Aug 2020
Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework
Li Tao
Xueting Wang
T. Yamasaki
SSL
85
106
0
06 Aug 2020
Late Temporal Modeling in 3D CNN Architectures with BERT for Action Recognition
M. E. Kalfaoglu
Sinan Kalkan
A. Aydin Alatan
3DPC
90
143
0
03 Aug 2020
Residual Frames with Efficient Pseudo-3D CNN for Human Action Recognition
Jiawei Chen
Jenson Hsiao
C. Ho
55
5
0
03 Aug 2020
The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020)
Samuel Albanie
Yang Liu
Arsha Nagrani
Antoine Miech
Ernesto Coto
...
Kaixu Cui
Hui Liu
Chen Wang
Yudong Jiang
Xiaoshuai Hao
87
9
0
03 Aug 2020
Learning Video Representations from Textual Web Supervision
Jonathan C. Stroud
Zhichao Lu
Chen Sun
Jia Deng
Rahul Sukthankar
Cordelia Schmid
David A. Ross
SSL
113
48
0
29 Jul 2020
Approximated Bilinear Modules for Temporal Modeling
Xinqi Zhu
Chang Xu
Langwen Hui
Cewu Lu
Dacheng Tao
67
24
0
25 Jul 2020
AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification
Xiaofang Wang
Xuehan Xiong
Maxim Neumann
A. Piergiovanni
Michael S. Ryoo
A. Angelova
Kris Kitani
Wei Hua
103
51
0
23 Jul 2020
Perceptron Synthesis Network: Rethinking the Action Scale Variances in Videos
Yuan Tian
Guangtao Zhai
Zhiyong Gao
35
0
0
22 Jul 2020
Depthwise Spatio-Temporal STFT Convolutional Neural Networks for Human Action Recognition
Sudhakar Kumawat
Manisha Verma
Yuta Nakashima
Shanmuganathan Raman
204
44
0
22 Jul 2020
Directional Temporal Modeling for Action Recognition
Xinyu Li
Bing Shuai
Joseph Tighe
65
42
0
21 Jul 2020
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
591
612
0
21 Jul 2020
Hierarchical Contrastive Motion Learning for Video Action Recognition
Xitong Yang
Xiaodong Yang
Sifei Liu
Deqing Sun
L. Davis
Jan Kautz
SSL
108
13
0
20 Jul 2020
MotionSqueeze: Neural Motion Feature Learning for Video Understanding
Heeseung Kwon
Manjin Kim
Suha Kwak
Minsu Cho
FAtt
95
128
0
20 Jul 2020
RT3D: Achieving Real-Time Execution of 3D Convolutional Neural Networks on Mobile Devices
Wei Niu
Mengshu Sun
Zechao Li
Jou-An Chen
Jiexiong Guan
Xipeng Shen
Yanzhi Wang
Sijia Liu
Xue Lin
Bin Ren
MQ
64
12
0
20 Jul 2020
Region-based Non-local Operation for Video Classification
Guoxi Huang
A. Bors
77
11
0
17 Jul 2020
Temporal Distinct Representation Learning for Action Recognition
Junwu Weng
Donghao Luo
Yabiao Wang
Ying Tai
Chengjie Wang
Jilin Li
Feiyue Huang
Xudong Jiang
Junsong Yuan
76
26
0
15 Jul 2020
Alleviating Over-segmentation Errors by Detecting Action Boundaries
Yuchi Ishikawa
Seito Kasai
Y. Aoki
Hirokatsu Kataoka
74
139
0
14 Jul 2020
IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos
Gyeongsik Moon
Heeseung Kwon
Kyoung Mu Lee
Minsu Cho
70
26
0
13 Jul 2020
Universal-to-Specific Framework for Complex Action Recognition
Peisen Zhao
Lingxi Xie
Ya Zhang
Qi Tian
60
9
0
13 Jul 2020
Aligning Videos in Space and Time
Senthil Purushwalkam
Tian-Chun Ye
Saurabh Gupta
Abhinav Gupta
77
23
0
09 Jul 2020
Group Ensemble: Learning an Ensemble of ConvNets in a single ConvNet
Hao Chen
Abhinav Shrivastava
57
14
0
01 Jul 2020
Self-Supervised MultiModal Versatile Networks
Jean-Baptiste Alayrac
Adrià Recasens
R. Schneider
Relja Arandjelović
Jason Ramapuram
J. Fauw
Lucas Smaira
Sander Dieleman
Andrew Zisserman
SSL
192
375
0
29 Jun 2020
Dynamic Sampling Networks for Efficient Action Recognition in Videos
Yin-Dong Zheng
Zhaoyang Liu
Tong Lu
Limin Wang
77
77
0
28 Jun 2020
Counting Out Time: Class Agnostic Video Repetition Counting in the Wild
Debidatta Dwibedi
Y. Aytar
Jonathan Tompson
P. Sermanet
Andrew Zisserman
AI4TS
80
114
0
27 Jun 2020
Motion Representation Using Residual Frames with 3D CNN
Li Tao
Xueting Wang
T. Yamasaki
3DPC
47
1
0
21 Jun 2020
Melanoma Diagnosis with Spatio-Temporal Feature Learning on Sequential Dermoscopic Images
Zhen Yu
Jennifer Nguyen
Xiaojun Chang
J. Kelly
C. Mclean
Lei Zhang
Victoria Mar
Z. Ge
MedIm
18
3
0
19 Jun 2020
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization
Junting Pan
Siyu Chen
Zheng Shou
Yu Liu
Jing Shao
Hongsheng Li
3DPC
108
151
0
14 Jun 2020
DTG-Net: Differentiated Teachers Guided Self-Supervised Video Action Recognition
Ziming Liu
Guangyu Gao
A. K. Qin
Jinyang Li
ViT
52
1
0
13 Jun 2020
Open-Narrow-Synechiae Anterior Chamber Angle Classification in AS-OCT Sequences
Huaying Hao
Huazhu Fu
Yanwu Xu
Jianlong Yang
Fei Li
Xiulan Zhang
Jiang-Dong Liu
Yitian Zhao
233
8
0
09 Jun 2020
PNL: Efficient Long-Range Dependencies Extraction with Pyramid Non-Local Module for Action Recognition
Yuecong Xu
Haozhi Cao
Jianfei Yang
K. Mao
Jianxiong Yin
Simon See
56
5
0
09 Jun 2020
ARID: A New Dataset for Recognizing Action in the Dark
Yuecong Xu
Jianfei Yang
Haozhi Cao
K. Mao
Jianxiong Yin
Simon See
77
73
0
06 Jun 2020