1711.09577
Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?
27 November 2017
Kensho Hara
Hirokatsu Kataoka
Y. Satoh
3DPC
Papers citing
"Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?"
50 / 287 papers shown
Title
VPN++: Rethinking Video-Pose embeddings for understanding Activities of Daily Living
Srijan Das
Rui Dai
Di Yang
F. Brémond
ViT
43
67
0
17 May 2021
AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition
Yikang Shen
Chun-Fu Chen
Quanfu Fan
Ximeng Sun
Kate Saenko
A. Oliva
Rogerio Feris
36
47
0
11 May 2021
Adaptive Focus for Efficient Video Recognition
Yulin Wang
Zhaoxi Chen
Haojun Jiang
Shiji Song
Yizeng Han
Gao Huang
45
98
0
07 May 2021
CoCon: Cooperative-Contrastive Learning
Nishant Rai
Ehsan Adeli
Kuan-Hui Lee
Adrien Gaidon
Juan Carlos Niebles
SSL
20
18
0
30 Apr 2021
Bridge to Answer: Structure-aware Graph Interaction Network for Video Question Answering
Jungin Park
Jiyoung Lee
Kwanghoon Sohn
167
100
0
29 Apr 2021
Shot Contrastive Self-Supervised Learning for Scene Boundary Detection
Shixing Chen
Xiaohan Nie
David D. Fan
Dongqing Zhang
Vimal Bhat
Raffay Hamid
SSL
27
62
0
28 Apr 2021
VidTr: Video Transformer Without Convolutions
Yanyi Zhang
Xinyu Li
Chunhui Liu
Bing Shuai
Yi Zhu
Biagio Brattoli
Hao Chen
I. Marsic
Joseph Tighe
ViT
148
193
0
23 Apr 2021
Graph-based Facial Affect Analysis: A Review
Yang Liu
Xingming Zhang
Yante Li
Jinzhao Zhou
Xin-hui Li
Guoying Zhao
CVBM
46
24
0
29 Mar 2021
No frame left behind: Full Video Action Recognition
X. Liu
S. Pintea
F. Karimi Nejadasl
Olaf Booij
Jan van Gemert
21
41
0
29 Mar 2021
Skeleton Aware Multi-modal Sign Language Recognition
Songyao Jiang
Bin Sun
Lichen Wang
Yue Bai
Kunpeng Li
Y. Fu
SLR
33
166
0
16 Mar 2021
ACTION-Net: Multipath Excitation for Action Recognition
Zhengwei Wang
Qi She
A. Smolic
3DPC
39
165
0
11 Mar 2021
Time and Frequency Network for Human Action Detection in Videos
Changhai Li
Huawei Chen
Jingqing Lu
Yang Huang
Yingying Liu
3DH
AI4TS
13
2
0
08 Mar 2021
Learning to Recognize Actions on Objects in Egocentric Video with Attention Dictionaries
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
EgoV
27
15
0
16 Feb 2021
VA-RED²: Video Adaptive Redundancy Reduction
Bowen Pan
Yikang Shen
Camilo Luciano Fosco
Chung-Ching Lin
A. Andonian
Yue Meng
Kate Saenko
A. Oliva
Rogerio Feris
15
19
0
15 Feb 2021
In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning
Mamshad Nayeem Rizve
Kevin Duarte
Yogesh S Rawat
M. Shah
247
509
0
15 Jan 2021
FakeBuster: A DeepFakes Detection Tool for Video Conferencing Scenarios
V. Mehta
Parul Gupta
Ramanathan Subramanian
Abhinav Dhall
CVBM
33
22
0
09 Jan 2021
Uncertainty-sensitive Activity Recognition: a Reliability Benchmark and the CARING Models
Alina Roitberg
Monica Haurilet
Manuel Martínez
Rainer Stiefelhagen
UQCV
39
6
0
02 Jan 2021
DVD: A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue
Hung Le
Chinnadhurai Sankar
Seungwhan Moon
Ahmad Beirami
A. Geramifard
Satwik Kottur
VGen
33
18
0
01 Jan 2021
A Comprehensive Study of Deep Video Action Recognition
Yi Zhu
Xinyu Li
Chunhui Liu
Mohammadreza Zolfaghari
Yuanjun Xiong
Chongruo Wu
Zhi-Li Zhang
Joseph Tighe
R. Manmatha
Mu Li
VLM
AI4TS
38
185
0
11 Dec 2020
Relational Learning for Skill Preconditions
Mohit Sharma
Oliver Kroemer
SSL
19
18
0
03 Dec 2020
A Comprehensive Review on Recent Methods and Challenges of Video Description
Ashutosh Kumar Singh
Thoudam Doren Singh
Sivaji Bandyopadhyay
3DV
VLM
19
5
0
30 Nov 2020
Hierarchically Decoupled Spatial-Temporal Contrast for Self-supervised Video Representation Learning
Zehua Zhang
David J. Crandall
AI4TS
SSL
28
23
0
23 Nov 2020
We don't Need Thousand Proposals: Single Shot Actor-Action Detection in Videos
A. J. Rana
Yogesh S Rawat
ViT
13
11
0
22 Nov 2020
Whose hand is this? Person Identification from Egocentric Hand Gestures
Satoshi Tsutsui
Yanwei Fu
David J. Crandall
EgoV
16
7
0
17 Nov 2020
Temporal Stochastic Softmax for 3D CNNs: An Application in Facial Expression Recognition
T. Ayral
M. Pedersoli
Simon L Bacon
Eric Granger
CVBM
3DH
13
11
0
10 Nov 2020
Multi-Temporal Convolutions for Human Action Recognition in Videos
Alexandros Stergiou
R. Poppe
24
1
0
08 Nov 2020
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning
Simon Ging
Mohammadreza Zolfaghari
Hamed Pirsiavash
Thomas Brox
ViT
CLIP
31
168
0
01 Nov 2020
Pretext-Contrastive Learning: Toward Good Practices in Self-supervised Video Representation Leaning
L. Tao
Xueting Wang
T. Yamasaki
VLM
SSL
23
14
0
29 Oct 2020
ElderSim: A Synthetic Data Generation Platform for Human Action Recognition in Eldercare Applications
Hochul Hwang
Cheongjae Jang
Geonwoo Park
Junghyun Cho
Ig-Jae Kim
32
70
0
28 Oct 2020
Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition
Chun-Fu Chen
Yikang Shen
K. Ramakrishnan
Rogerio Feris
J. M. Cohn
A. Oliva
Quanfu Fan
23
95
0
22 Oct 2020
TMT: A Transformer-based Modal Translator for Improving Multimodal Sequence Representations in Audio Visual Scene-aware Dialog
Wubo Li
Dongwei Jiang
Wei Zou
Xiangang Li
23
6
0
21 Oct 2020
BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded Dialogues
Hung Le
Doyen Sahoo
Nancy F. Chen
Guosheng Lin
47
30
0
20 Oct 2020
What is More Likely to Happen Next? Video-and-Language Future Event Prediction
Jie Lei
Licheng Yu
Tamara L. Berg
Joey Tianyi Zhou
33
72
0
15 Oct 2020
Concentrated Multi-Grained Multi-Attention Network for Video Based Person Re-Identification
Panwen Hu
Jiazhen Liu
Rui Huang
28
2
0
28 Sep 2020
MultAV: Multiplicative Adversarial Videos
Shao-Yuan Lo
Vishal M. Patel
AAML
26
8
0
17 Sep 2020
Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion
Jinpeng Wang
Yuting Gao
Ke Li
Jianguo Hu
Xinyang Jiang
Xiao-Wei Guo
Rongrong Ji
Xing Sun
37
62
0
12 Sep 2020
Defending Against Multiple and Unforeseen Adversarial Videos
Shao-Yuan Lo
Vishal M. Patel
AAML
31
23
0
11 Sep 2020
DMD: A Large-Scale Multi-Modal Driver Monitoring Dataset for Attention and Alertness Analysis
J. Ortega
Neslihan Köse
P. Cañas
Min-An Chao
A. Unnervik
Marcos Nieto
Oihana Otaegui
L. Salgado
24
91
0
27 Aug 2020
Making a Case for 3D Convolutions for Object Segmentation in Videos
Sabarinath Mahadevan
A. Athar
Aljosa Osep
Sebastian Hennen
Laura Leal-Taixé
Bastian Leibe
VOS
21
86
0
26 Aug 2020
Effective Action Recognition with Embedded Key Point Shifts
Haozhi Cao
Yuecong Xu
Jianfei Yang
K. Mao
Jianxiong Yin
Simon See
15
7
0
26 Aug 2020
Spatiotemporal Contrastive Video Representation Learning
Rui Qian
Tianjian Meng
Boqing Gong
Ming-Hsuan Yang
Haoran Wang
Serge J. Belongie
Huayu Chen
SSL
AI4TS
41
492
0
09 Aug 2020
Late Temporal Modeling in 3D CNN Architectures with BERT for Action Recognition
M. E. Kalfaoglu
Sinan Kalkan
A. Aydin Alatan
3DPC
39
140
0
03 Aug 2020
HEU Emotion: A Large-scale Database for Multi-modal Emotion Recognition in the Wild
Jia Chen
Chenhui Wang
Ke-jun Wang
Chaoqun Yin
Cong Zhao
Tao Xu
Xinyi Zhang
Ziqiang Huang
Meichen Liu
Tao Yang
CVBM
17
39
0
24 Jul 2020
Hierarchical Contrastive Motion Learning for Video Action Recognition
Xitong Yang
Xiaodong Yang
Sifei Liu
Deqing Sun
L. Davis
Jan Kautz
SSL
35
13
0
20 Jul 2020
MINI-Net: Multiple Instance Ranking Network for Video Highlight Detection
Fa-Ting Hong
Xuanteng Huang
Weihong Li
Weishi Zheng
8
61
0
20 Jul 2020
TinyVIRAT: Low-resolution Video Action Recognition
Ugur Demir
Yogesh S Rawat
M. Shah
33
36
0
14 Jul 2020
Video-Grounded Dialogues with Pretrained Generation Language Models
Hung Le
Guosheng Lin
34
28
0
27 Jun 2020
Comprehensive Information Integration Modeling Framework for Video Titling
Shengyu Zhang
Ziqi Tan
Jin Yu
Zhou Zhao
Kun Kuang
Tan Jiang
Jingren Zhou
Hongxia Yang
Fei Wu
31
40
0
24 Jun 2020
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos
Andrew Rouditchenko
Angie Boggust
David Harwath
Brian Chen
D. Joshi
...
Rogerio Feris
Brian Kingsbury
M. Picheny
Antonio Torralba
James R. Glass
SSL
22
141
0
16 Jun 2020
Learn to cycle: Time-consistent feature discovery for action recognition
Alexandros Stergiou
R. Poppe
22
23
0
15 Jun 2020