Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1711.11248
Cited By
A Closer Look at Spatiotemporal Convolutions for Action Recognition
30 November 2017
Du Tran
Heng Wang
Lorenzo Torresani
Jamie Ray
Yann LeCun
Manohar Paluri
Re-assign community
ArXiv
PDF
HTML
Papers citing
"A Closer Look at Spatiotemporal Convolutions for Action Recognition"
50 / 477 papers shown
Title
Knowledge Distillation for Action Anticipation via Label Smoothing
Guglielmo Camporese
Pasquale Coscia
Antonino Furnari
G. Farinella
Lamberto Ballan
EgoV
40
36
0
16 Apr 2020
Would Mega-scale Datasets Further Enhance Spatiotemporal 3D CNNs?
Hirokatsu Kataoka
Tenga Wakamiya
Kensho Hara
Y. Satoh
3DPC
31
87
0
10 Apr 2020
X3D: Expanding Architectures for Efficient Video Recognition
Christoph Feichtenhofer
73
1,001
0
09 Apr 2020
TEA: Temporal Excitation and Aggregation for Action Recognition
Yan-Ran Li
Bin Ji
Xintian Shi
Jianguo Zhang
Bin Kang
Limin Wang
ViT
25
439
0
03 Apr 2020
Knowing What, Where and When to Look: Efficient Video Action Modeling with Attention
Juan-Manuel Perez-Rua
Brais Martínez
Xiatian Zhu
Antoine Toisoul
Victor Escorcia
Tao Xiang
48
19
0
02 Apr 2020
Speech2Action: Cross-modal Supervision for Action Recognition
Arsha Nagrani
Chen Sun
David A. Ross
Rahul Sukthankar
Cordelia Schmid
Andrew Zisserman
33
54
0
30 Mar 2020
Coronary Artery Segmentation in Angiographic Videos Using A 3D-2D CE-Net
Lu Wang
Dongxue Liang
Xiao-Lei Yin
Jing Qiu
Zhi-Yun Yang
Jun-Hui Xing
Jian-Zeng Dong
Zhao-Yuan Ma
MedIm
21
0
0
26 Mar 2020
MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird's Eye View Maps
Pengxiang Wu
Siheng Chen
Dimitris N. Metaxas
3DPC
14
154
0
15 Mar 2020
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications
Biagio Brattoli
Joseph Tighe
Fedor Zhdanov
Pietro Perona
Krzysztof Chalupka
VLM
137
127
0
03 Mar 2020
Evolving Losses for Unsupervised Video Representation Learning
A. Piergiovanni
A. Angelova
Michael S. Ryoo
SSL
27
138
0
26 Feb 2020
A Survey on 3D Skeleton-Based Action Recognition Using Learning Method
Bin Ren
Mengyuan Liu
Runwei Ding
Hong Liu
27
121
0
14 Feb 2020
Over-the-Air Adversarial Flickering Attacks against Video Recognition Networks
Roi Pony
I. Naeh
Shie Mannor
AAML
18
51
0
12 Feb 2020
Dynamic Inference: A New Approach Toward Efficient Video Action Recognition
Wenhao Wu
Dongliang He
Xiao Tan
Shifeng Chen
Yi Yang
Shilei Wen
24
35
0
09 Feb 2020
Audiovisual SlowFast Networks for Video Recognition
Fanyi Xiao
Yong Jae Lee
Kristen Grauman
Jitendra Malik
Christoph Feichtenhofer
197
207
0
23 Jan 2020
Weakly Supervised Temporal Action Localization Using Deep Metric Learning
Ashraful Islam
Richard J. Radke
27
46
0
21 Jan 2020
Temporal Interlacing Network
Hao Shao
Shengju Qian
Yu Liu
29
92
0
17 Jan 2020
Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning
Dezhao Luo
Chang-rui Liu
Yu Zhou
Dongbao Yang
Can Ma
QiXiang Ye
Weiping Wang
SSL
25
160
0
02 Jan 2020
Lower Dimensional Kernels for Video Discriminators
Emmanuel Kahembwe
S. Ramamoorthy
24
50
0
18 Dec 2019
Listen to Look: Action Recognition by Previewing Audio
Ruohan Gao
Tae-Hyun Oh
Kristen Grauman
Lorenzo Torresani
VLM
29
251
0
10 Dec 2019
Temporal Factorization of 3D Convolutional Kernels
Gabrielle Ras
L. Ambrogioni
Umut Güçlü
Marcel van Gerven
14
1
0
09 Dec 2019
A Multigrid Method for Efficiently Training Video Models
Chaoxia Wu
Ross B. Girshick
Kaiming He
Christoph Feichtenhofer
Philipp Krahenbuhl
18
94
0
02 Dec 2019
More Is Less: Learning Efficient Video Representations by Big-Little Network and Depthwise Temporal Aggregation
Quanfu Fan
Chun-Fu Chen
Hilde Kuehne
Marco Pistoia
David D. Cox
32
126
0
02 Dec 2019
STConvS2S: Spatiotemporal Convolutional Sequence to Sequence Network for Weather Forecasting
Rafaela C. Nascimento
Y. M. Souto
Eduardo S. Ogasawara
Fábio Porto
Eduardo Bezerra
AI4TS
17
82
0
30 Nov 2019
Self-Supervised Learning by Cross-Modal Audio-Video Clustering
Humam Alwassel
D. Mahajan
Bruno Korbar
Lorenzo Torresani
Guohao Li
Du Tran
SSL
42
428
0
28 Nov 2019
MMTM: Multimodal Transfer Module for CNN Fusion
Hamid Reza Vaezi Joze
Amirreza Shaban
Michael L. Iuzzolino
K. Koishida
18
277
0
20 Nov 2019
You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization
Okan Kopuklu
Xiangyu Wei
Gerhard Rigoll
28
143
0
15 Nov 2019
Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos
Yitian Yuan
Lin Ma
Jingwen Wang
Wei Liu
Wenwu Zhu
30
242
0
31 Oct 2019
Comprehensive Video Understanding: Video summarization with content-based video recommender design
Yudong Jiang
Kaixu Cui
B. Peng
Changliang Xu
BDL
14
28
0
30 Oct 2019
TrajectoryNet: a new spatio-temporal feature learning network for human motion prediction
Xiaoli Liu
Jianqin Yin
Jin Liu
Pengxiang Ding
Jun Liu
Huaping Liu
3DH
27
11
0
15 Oct 2019
CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning
Rohit Girdhar
Deva Ramanan
19
176
0
10 Oct 2019
Graph-based Spatial-temporal Feature Learning for Neuromorphic Vision Sensing
Yin Bi
Aaron Chadha
Alhabib Abbas
Eirina Bourtsoulatze
Y. Andreopoulos
25
26
0
08 Oct 2019
CapsuleVOS: Semi-Supervised Video Object Segmentation Using Capsule Routing
Kevin Duarte
Yogesh S Rawat
M. Shah
VOS
14
68
0
30 Sep 2019
Spatio-Temporal FAST 3D Convolutions for Human Action Recognition
Alexandros Stergiou
R. Poppe
3DH
20
19
0
30 Sep 2019
Grouped Spatial-Temporal Aggregation for Efficient Action Recognition
Chenxu Luo
Alan Yuille
130
150
0
28 Sep 2019
Exploring Temporal Differences in 3D Convolutional Neural Networks
Gagan Kanojia
Sudhakar Kumawat
Shanmuganathan Raman
3DPC
AI4TS
21
3
0
07 Sep 2019
PISEP^2: Pseudo Image Sequence Evolution based 3D Pose Prediction
Xiaoli Liu
Jianqin Yin
Huaping Liu
Yilong Yin
3DH
32
7
0
04 Sep 2019
Explainable Video Action Reasoning via Prior Knowledge and State Transitions
Tao Zhuo
Zhiyong Cheng
Peng Zhang
Yongkang Wong
Mohan Kankanhalli
FAtt
33
60
0
28 Aug 2019
Action recognition with spatial-temporal discriminative filter banks
Brais Martínez
Davide Modolo
Yuanjun Xiong
Joseph Tighe
18
66
0
20 Aug 2019
Enhanced 3D convolutional networks for crowd counting
Zhikang Zou
Huiliang Shao
Xiaoye Qu
Wei Wei
Pan Zhou
33
38
0
12 Aug 2019
SF-Net: Structured Feature Network for Continuous Sign Language Recognition
Zhaoyang Yang
Zhenmei Shi
Xiaoyong Shen
Yu-Wing Tai
SLR
27
63
0
04 Aug 2019
Use What You Have: Video Retrieval Using Representations From Collaborative Experts
Yang Liu
Samuel Albanie
Arsha Nagrani
Andrew Zisserman
36
387
0
31 Jul 2019
Remote Heart Rate Measurement from Highly Compressed Facial Videos: an End-to-end Deep Learning Solution with Video Enhancement
Zitong Yu
Wei Peng
Xiaobai Li
Xiaopeng Hong
Guoying Zhao
32
267
0
27 Jul 2019
An Efficient 3D CNN for Action/Object Segmentation in Video
Rui Hou
Chong Chen
Rahul Sukthankar
M. Shah
24
27
0
21 Jul 2019
Only Time Can Tell: Discovering Temporal Data for Temporal Modeling
Laura Sevilla-Lara
Shengxin Cindy Zha
Zhicheng Yan
Vedanuj Goswami
Matt Feiszli
Lorenzo Torresani
50
75
0
19 Jul 2019
Video Action Recognition Via Neural Architecture Searching
Wei Peng
Xiaopeng Hong
Guoying Zhao
41
36
0
10 Jul 2019
Deformable Tube Network for Action Detection in Videos
Wei Li
Zehuan Yuan
Dashan Guo
Lei Huang
Xiangzhong Fang
Changhu Wang
ViT
MedIm
33
5
0
03 Jul 2019
Few-Shot Video Classification via Temporal Alignment
Kaidi Cao
Jingwei Ji
Zhangjie Cao
C. Chang
Juan Carlos Niebles
AI4TS
27
235
0
27 Jun 2019
HPLFlowNet: Hierarchical Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-scale Point Clouds
Xiuye Gu
Yijie Wang
Chongruo Wu
Yong Jae Lee
Panqu Wang
3DPC
29
206
0
12 Jun 2019
Lightweight Network Architecture for Real-Time Action Recognition
Alexander Kozlov
Vadim Andronov
Y. Gritsenko
ViT
25
33
0
21 May 2019
STAR: A Concise Deep Learning Framework for Citywide Human Mobility Prediction
Hongnian Wang
Han Su
HAI
26
17
0
16 May 2019
Previous
1
2
3
...
10
8
9
Next