arXiv:1712.04851
Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification
13 December 2017
Saining Xie
Chen Sun
Jonathan Huang
Z. Tu
Kevin Patrick Murphy
3DH
Papers citing "Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-offs in Video Classification"
50 / 650 papers shown
Title
Efficient Spatialtemporal Context Modeling for Action Recognition
Congqi Cao
Yue Lu
Yifan Zhang
D. Jiang
Yanning Zhang
29
4
0
20 Mar 2021
MDMMT: Multidomain Multimodal Transformer for Video Retrieval
Maksim Dzabraev
M. Kalashnikov
Stepan Alekseevich Komkov
Aleksandr Petiushko
24
128
0
19 Mar 2021
NAS-TC: Neural Architecture Search on Temporal Convolutions for Complex Action Recognition
Pengzhen Ren
Gang Xiao
Xiaojun Chang
Yun Xiao
Zhihui Li
Xiaojiang Chen
ViT
26
4
0
17 Mar 2021
Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision
Andrew Shin
Masato Ishii
T. Narihira
35
37
0
06 Mar 2021
Unsupervised Motion Representation Enhanced Network for Action Recognition
Xiaohang Yang
Lingtong Kong
Jie Yang
16
4
0
05 Mar 2021
VA-RED$^2$: Video Adaptive Redundancy Reduction
Bowen Pan
Yikang Shen
Camilo Luciano Fosco
Chung-Ching Lin
A. Andonian
Yue Meng
Kate Saenko
A. Oliva
Rogerio Feris
15
19
0
15 Feb 2021
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei
Linjie Li
Luowei Zhou
Zhe Gan
Tamara L. Berg
Joey Tianyi Zhou
Jingjing Liu
CLIP
46
647
0
11 Feb 2021
AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition
Yue Meng
Yikang Shen
Chung-Ching Lin
P. Sattigeri
Leonid Karlinsky
Kate Saenko
A. Oliva
Rogerio Feris
70
62
0
10 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
283
1,984
0
09 Feb 2021
Bridging the gap between Human Action Recognition and Online Action Detection
Alban Main De Boissiere
R. Noumeir
22
0
0
21 Jan 2021
Few-shot Action Recognition with Prototype-centered Attentive Learning
Xiatian Zhu
Antoine Toisoul
Juan-Manuel Pérez-Rúa
Li Zhang
Brais Martínez
Tao Xiang
39
53
0
20 Jan 2021
TCLR: Temporal Contrastive Learning for Video Representation
I. Dave
Rohit Gupta
Mamshad Nayeem Rizve
Mubarak Shah
SSL
AI4TS
34
175
0
20 Jan 2021
3D-ANAS: 3D Asymmetric Neural Architecture Search for Fast Hyperspectral Image Classification
Haokui Zhang
Chengrong Gong
Yunpeng Bai
Zongwen Bai
Ying Li
14
27
0
12 Jan 2021
Learning from Weakly-labeled Web Videos via Exploring Sub-Concepts
Kunpeng Li
Zizhao Zhang
Guanhang Wu
Xuehan Xiong
Chen-Yu Lee
Zhichao Lu
Y. Fu
Tomas Pfister
29
5
0
11 Jan 2021
2D or not 2D? Adaptive 3D Convolution Selection for Efficient Video Recognition
Hengduo Li
Zuxuan Wu
Abhinav Shrivastava
L. Davis
27
35
0
29 Dec 2020
Global Context Networks
Yue Cao
Jiarui Xu
Stephen Lin
Fangyun Wei
Han Hu
ISeg
36
96
0
24 Dec 2020
Human Action Recognition from Various Data Modalities: A Review
Zehua Sun
Qiuhong Ke
Hossein Rahmani
Mohammed Bennamoun
Gang Wang
Jun Liu
MU
53
504
0
22 Dec 2020
TDN: Temporal Difference Networks for Efficient Action Recognition
Limin Wang
Zhan Tong
Bin Ji
Gangshan Wu
23
391
0
18 Dec 2020
Multi-shot Temporal Event Localization: a Benchmark
Xiaolong Liu
Yao Hu
S. Bai
Fei Ding
X. Bai
Philip Torr
46
81
0
17 Dec 2020
FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation
Tarun Kalluri
Deepak Pathak
Manmohan Chandraker
Du Tran
VGen
25
142
0
15 Dec 2020
GTA: Global Temporal Attention for Video Action Understanding
Bo He
Xitong Yang
Zuxuan Wu
Hao Chen
Ser-Nam Lim
Abhinav Shrivastava
ViT
33
27
0
15 Dec 2020
NUTA: Non-uniform Temporal Aggregation for Action Recognition
Xinyu Li
Chunhui Liu
Bing Shuai
Yi Zhu
Hao Chen
Joseph Tighe
ViT
16
16
0
15 Dec 2020
A Comprehensive Study of Deep Video Action Recognition
Yi Zhu
Xinyu Li
Chunhui Liu
Mohammadreza Zolfaghari
Yuanjun Xiong
Chongruo Wu
Zhi-Li Zhang
Joseph Tighe
R. Manmatha
Mu Li
VLM
AI4TS
38
185
0
11 Dec 2020
ViNet: Pushing the limits of Visual Modality for Audio-Visual Saliency Prediction
Samyak Jain
P. Yarlagadda
Shreyank Jyoti
Shyamgopal Karthik
Subramanian Ramanathan
Vineet Gandhi
ViT
29
66
0
11 Dec 2020
Look Before you Speak: Visually Contextualized Utterances
Paul Hongsuck Seo
Arsha Nagrani
Cordelia Schmid
21
66
0
10 Dec 2020
Diverse Temporal Aggregation and Depthwise Spatiotemporal Factorization for Efficient Video Classification
Youngwan Lee
Hyungil Kim
Kimin Yun
Jinyoung Moon
26
12
0
01 Dec 2020
Recent Progress in Appearance-based Action Recognition
J. Humphreys
Zhe Chen
Dacheng Tao
24
0
0
25 Nov 2020
A3D: Adaptive 3D Networks for Video Action Recognition
Sijie Zhu
Taojiannan Yang
Matías Mendieta
Chong Chen
3DH
29
12
0
24 Nov 2020
Play Fair: Frame Attributions in Video Models
Will Price
Dima Damen
FAtt
23
5
0
24 Nov 2020
QuerYD: A video dataset with high-quality text and audio narrations
Andreea-Maria Oncescu
João F. Henriques
Yang Liu
Andrew Zisserman
Samuel Albanie
VGen
16
11
0
22 Nov 2020
We don't Need Thousand Proposals: Single Shot Actor-Action Detection in Videos
A. J. Rana
Yogesh S Rawat
ViT
13
11
0
22 Nov 2020
3D CNNs with Adaptive Temporal Feature Resolutions
Mohsen Fayyaz
Emad Bahrami Rad
Ali Diba
M. Noroozi
Ehsan Adeli
Luc Van Gool
Juergen Gall
3DPC
21
30
0
17 Nov 2020
ActBERT: Learning Global-Local Video-Text Representations
Linchao Zhu
Yi Yang
ViT
46
417
0
14 Nov 2020
Multimodal Pretraining for Dense Video Captioning
Gabriel Huang
Bo Pang
Zhenhai Zhu
Clara E. Rivera
Radu Soricut
21
81
0
10 Nov 2020
Temporal Stochastic Softmax for 3D CNNs: An Application in Facial Expression Recognition
T. Ayral
M. Pedersoli
Simon L Bacon
Eric Granger
CVBM
3DH
13
11
0
10 Nov 2020
Mutual Modality Learning for Video Action Classification
Stepan Alekseevich Komkov
Maksim Dzabraev
Aleksandr Petiushko
27
9
0
04 Nov 2020
PV-NAS: Practical Neural Architecture Search for Video Recognition
Zihao Wang
Chen Lin
Lu Sheng
Junjie Yan
Jing Shao
ViT
14
7
0
02 Nov 2020
Pretext-Contrastive Learning: Toward Good Practices in Self-supervised Video Representation Leaning
L. Tao
Xueting Wang
T. Yamasaki
VLM
SSL
23
14
0
29 Oct 2020
Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition
Chun-Fu Chen
Yikang Shen
K. Ramakrishnan
Rogerio Feris
J. M. Cohn
A. Oliva
Quanfu Fan
23
95
0
22 Oct 2020
Pose And Joint-Aware Action Recognition
Anshul B. Shah
Shlok Kumar Mishra
Ankan Bansal
Jun-Cheng Chen
Ramalingam Chellappa
Abhinav Shrivastava
39
33
0
16 Oct 2020
Back to the Future: Cycle Encoding Prediction for Self-supervised Contrastive Video Representation Learning
Xinyu Yang
Majid Mirmehdi
T. Burghardt
27
4
0
14 Oct 2020
Boosting Continuous Sign Language Recognition via Cross Modality Augmentation
Junfu Pu
Wen-gang Zhou
Hezhen Hu
Houqiang Li
43
108
0
11 Oct 2020
Contrastive Representation Learning: A Framework and Review
Phúc H. Lê Khắc
Graham Healy
Alan F. Smeaton
SSL
AI4TS
178
686
0
10 Oct 2020
Support-set bottlenecks for video-text representation learning
Mandela Patrick
Po-Yao (Bernie) Huang
Yuki M. Asano
Florian Metze
Alexander G. Hauptmann
João Henriques
Andrea Vedaldi
22
244
0
06 Oct 2020
Hierarchical Domain-Adapted Feature Learning for Video Saliency Prediction
Giovanni Bellitto
Federica Proietto Salanitri
S. Palazzo
Francesco Rundo
Daniela Giordano
C. Spampinato
MDE
15
52
0
02 Oct 2020
PERF-Net: Pose Empowered RGB-Flow Net
Yinxiao Li
Zhichao Lu
Xuehan Xiong
Jonathan Huang
3DH
37
17
0
28 Sep 2020
On the spatiotemporal behavior in biology-mimicking computing systems
J. Végh
Ádám-József Berki
14
6
0
18 Sep 2020
Discovering Dynamic Salient Regions for Spatio-Temporal Graph Neural Networks
Iulia Duta
Andrei Liviu Nicolicioiu
Marius Leordeanu
26
6
0
17 Sep 2020
Multi-Label Activity Recognition using Activity-specific Features and Activity Correlations
Yanyi Zhang
Xinyu Li
I. Marsic
HAI
28
23
0
16 Sep 2020
Online Spatiotemporal Action Detection and Prediction via Causal Representations
Gurkirt Singh
3DPC
CML
21
0
0
31 Aug 2020