1705.07750
v3 (latest)
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
22 May 2017
João Carreira
Andrew Zisserman
Papers citing
"Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"
50 / 3,647 papers shown
Title
Learning Video Representations from Textual Web Supervision
Jonathan C. Stroud
Zhichao Lu
Chen Sun
Jia Deng
Rahul Sukthankar
Cordelia Schmid
David A. Ross
SSL
113
48
0
29 Jul 2020
Enriching Video Captions With Contextual Text
Philipp Rimle
Pelin Dogan
Markus Gross
59
3
0
29 Jul 2020
3D Neural Network for Lung Cancer Risk Prediction on CT Volumes
Daniel Korat
14
0
0
25 Jul 2020
Approximated Bilinear Modules for Temporal Modeling
Xinqi Zhu
Chang Xu
Langwen Hui
Cewu Lu
Dacheng Tao
70
24
0
25 Jul 2020
The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism
Yosuke Oyama
N. Maruyama
Nikoli Dryden
Erin McCarthy
P. Harrington
J. Balewski
Satoshi Matsuoka
Peter Nugent
B. Van Essen
3DV
AI4CE
71
37
0
25 Jul 2020
AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification
Xiaofang Wang
Xuehan Xiong
Maxim Neumann
A. Piergiovanni
Michael S. Ryoo
A. Angelova
Kris Kitani
Wei Hua
103
51
0
23 Jul 2020
SBAT: Video Captioning with Sparse Boundary-Aware Transformer
Tao Jin
Siyu Huang
Ming Chen
Yingming Li
Zhongfei Zhang
111
56
0
23 Jul 2020
Perceptron Synthesis Network: Rethinking the Action Scale Variances in Videos
Yuan Tian
Guangtao Zhai
Zhiyong Gao
35
0
0
22 Jul 2020
Depthwise Spatio-Temporal STFT Convolutional Neural Networks for Human Action Recognition
Sudhakar Kumawat
Manisha Verma
Yuta Nakashima
Shanmuganathan Raman
204
44
0
22 Jul 2020
Rethinking CNN Models for Audio Classification
Kamalesh Palanisamy
Dipika Singhania
Angela Yao
SSL
83
146
0
22 Jul 2020
Creating a Large-scale Synthetic Dataset for Human Activity Recognition
Ollie Matthews
Koki Ryu
Tarun Srivastava
63
6
0
21 Jul 2020
Directional Temporal Modeling for Action Recognition
Xinyu Li
Bing Shuai
Joseph Tighe
65
42
0
21 Jul 2020
PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding
Saining Xie
Jiatao Gu
Demi Guo
C. Qi
Leonidas Guibas
Or Litany
3DPC
250
648
0
21 Jul 2020
Foley Music: Learning to Generate Music from Videos
Chuang Gan
Deng Huang
Peihao Chen
J. Tenenbaum
Antonio Torralba
VGen
75
139
0
21 Jul 2020
Recurrent Exposure Generation for Low-Light Face Detection
Jinxiu Liang
Jingwen Wang
Yuhui Quan
Tianyi Chen
Jiaying Liu
Haibin Ling
Yong-mei Xu
CVBM
108
67
0
21 Jul 2020
MovieNet: A Holistic Dataset for Movie Understanding
Qingqiu Huang
Yu Xiong
Anyi Rao
Jiaze Wang
Dahua Lin
VGen
111
244
0
21 Jul 2020
Uncertainty-Aware Weakly Supervised Action Detection from Untrimmed Videos
Anurag Arnab
Chen Sun
Arsha Nagrani
Cordelia Schmid
71
25
0
21 Jul 2020
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
608
612
0
21 Jul 2020
Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing
Yapeng Tian
Dingzeyu Li
Chenliang Xu
135
185
0
21 Jul 2020
Hierarchical Contrastive Motion Learning for Video Action Recognition
Xitong Yang
Xiaodong Yang
Sifei Liu
Deqing Sun
L. Davis
Jan Kautz
SSL
110
13
0
20 Jul 2020
Learning Joint Spatial-Temporal Transformations for Video Inpainting
Yanhong Zeng
Jianlong Fu
Hongyang Chao
ViT
114
294
0
20 Jul 2020
Knowledge Graph Extraction from Videos
Louis Mahon
Eleonora Giunchiglia
Bowen Li
Thomas Lukasiewicz
52
20
0
20 Jul 2020
MotionSqueeze: Neural Motion Feature Learning for Video Understanding
Heeseung Kwon
Manjin Kim
Suha Kwak
Minsu Cho
FAtt
103
128
0
20 Jul 2020
Multimodal Dialogue State Tracking By QA Approach with Data Augmentation
Xiangyang Mou
Brandyn Sigouin
Ian Steenstra
Hui Su
52
9
0
20 Jul 2020
Context-Aware RCNN: A Baseline for Action Detection in Videos
Jianchao Wu
Zhanghui Kuang
Limin Wang
Wayne Zhang
Gangshan Wu
137
80
0
20 Jul 2020
RT3D: Achieving Real-Time Execution of 3D Convolutional Neural Networks on Mobile Devices
Wei Niu
Mengshu Sun
Zechao Li
Jou-An Chen
Jiexiong Guan
Xipeng Shen
Yanzhi Wang
Sijia Liu
Xue Lin
Bin Ren
MQ
64
12
0
20 Jul 2020
MINI-Net: Multiple Instance Ranking Network for Video Highlight Detection
Fa-Ting Hong
Xuanteng Huang
Weihong Li
Weishi Zheng
73
62
0
20 Jul 2020
E²Net: An Edge Enhanced Network for Accurate Liver and Tumor Segmentation on CT Scans
Youbao Tang
Yuxing Tang
Yingying Zhu
Jing Xiao
Ronald M. Summers
MedIm
79
53
0
19 Jul 2020
Social Adaptive Module for Weakly-supervised Group Activity Recognition
Rui Yan
Lingxi Xie
Jinhui Tang
Xiangbo Shu
Qi Tian
68
87
0
18 Jul 2020
Learning to Discretely Compose Reasoning Module Networks for Video Captioning
Ganchao Tan
Daqing Liu
Meng Wang
Zhengjun Zha
LRM
86
74
0
17 Jul 2020
Region-based Non-local Operation for Video Classification
Guoxi Huang
A. Bors
83
11
0
17 Jul 2020
Visual Relation Grounding in Videos
Junbin Xiao
Xindi Shang
Xun Yang
Sheng Tang
Tat-Seng Chua
80
40
0
17 Jul 2020
Appearance-Preserving 3D Convolution for Video-based Person Re-identification
Xinqian Gu
Hong Chang
Bingpeng Ma
Hongkai Zhang
Xilin Chen
3DH
3DPC
81
138
0
16 Jul 2020
Video-based Remote Physiological Measurement via Cross-verified Feature Disentangling
Xuesong Niu
Zitong Yu
Hu Han
Xiaobai Li
Shiguang Shan
Guoying Zhao
79
186
0
16 Jul 2020
Challenge report: VIPriors Action Recognition Challenge
Zhipeng Luo
Dawei Xu
Zhiguang Zhang
46
2
0
16 Jul 2020
Temporal Distinct Representation Learning for Action Recognition
Junwu Weng
Donghao Luo
Yabiao Wang
Ying Tai
Chengjie Wang
Jilin Li
Feiyue Huang
Xudong Jiang
Junsong Yuan
76
26
0
15 Jul 2020
TinyVIRAT: Low-resolution Video Action Recognition
Ugur Demir
Yogesh S Rawat
M. Shah
63
38
0
14 Jul 2020
COBE: Contextualized Object Embeddings from Narrated Instructional Video
Gedas Bertasius
Lorenzo Torresani
70
24
0
14 Jul 2020
Learning Semantics-enriched Representation via Self-discovery, Self-classification, and Self-restoration
F. Haghighi
M. Taher
Zongwei Zhou
Michael B. Gotway
Jianming Liang
MedIm
81
65
0
14 Jul 2020
Alleviating Over-segmentation Errors by Detecting Action Boundaries
Yuchi Ishikawa
Seito Kasai
Y. Aoki
Hirokatsu Kataoka
77
140
0
14 Jul 2020
Socially and Contextually Aware Human Motion and Pose Forecasting
Vida Adeli
Ehsan Adeli
Ian Reid
Juan Carlos Niebles
Hamid Rezatofighi
3DH
71
82
0
14 Jul 2020
Adversarial Background-Aware Loss for Weakly-supervised Temporal Activity Localization
Kyle Min
Jason J. Corso
51
103
0
13 Jul 2020
IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos
Gyeongsik Moon
Heeseung Kwon
Kyoung Mu Lee
Minsu Cho
70
26
0
13 Jul 2020
Fusing Motion Patterns and Key Visual Information for Semantic Event Recognition in Basketball Videos
Lifang Wu
Zhou Yang
Qi Wang
Meng Jian
Boxuan Zhao
Junchi Yan
Chang Wen Chen
70
33
0
13 Jul 2020
Universal-to-Specific Framework for Complex Action Recognition
Peisen Zhao
Lingxi Xie
Ya Zhang
Qi Tian
60
9
0
13 Jul 2020
Locality Guided Neural Networks for Explainable Artificial Intelligence
Randy Tan
N. Khan
L. Guan
33
8
0
12 Jul 2020
Representation Learning via Adversarially-Contrastive Optimal Transport
A. Cherian
Shuchin Aeron
OT
43
7
0
11 Jul 2020
Lightweight Modules for Efficient Deep Learning based Image Restoration
A. Lahiri
Sourav Bairagya
Sutanu Bera
Siddhant Haldar
P. Biswas
SupR
84
36
0
11 Jul 2020
Fast Video Object Segmentation With Temporal Aggregation Network and Dynamic Template Matching
Xuhua Huang
Jiarui Xu
Yu-Wing Tai
Chi-Keung Tang
VOS
137
67
0
11 Jul 2020
AViD Dataset: Anonymized Videos from Diverse Countries
A. Piergiovanni
Michael S. Ryoo
153
36
0
10 Jul 2020