1705.07750
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
22 May 2017
João Carreira
Andrew Zisserman
Papers citing
"Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"
50 / 1,478 papers shown
Title
Two-Stream Consensus Network for Weakly-Supervised Temporal Action Localization
Yuanhao Zhai
Le Wang
Wei Tang
Qilin Zhang
Junsong Yuan
G. Hua
36
134
0
22 Oct 2020
A Short Note on the Kinetics-700-2020 Human Action Dataset
Lucas Smaira
João Carreira
Eric Noland
Ellen Clancy
Amy Wu
Andrew Zisserman
21
137
0
21 Oct 2020
TMT: A Transformer-based Modal Translator for Improving Multimodal Sequence Representations in Audio Visual Scene-aware Dialog
Wubo Li
Dongwei Jiang
Wei Zou
Xiangang Li
23
6
0
21 Oct 2020
BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded Dialogues
Hung Le
Doyen Sahoo
Nancy F. Chen
Steven C.H. Hoi
52
30
0
20 Oct 2020
Unsupervised Domain Adaptation for Spatio-Temporal Action Localization
Nakul Agarwal
Yi-Ting Chen
Behzad Dariush
Ming-Hsuan Yang
27
8
0
19 Oct 2020
Hierarchical Autoregressive Modeling for Neural Video Compression
Ruihan Yang
Yibo Yang
Joseph Marino
Stephan Mandt
BDL
VGen
112
46
0
19 Oct 2020
Pose And Joint-Aware Action Recognition
Anshul B. Shah
Shlok Kumar Mishra
Ankan Bansal
Jun-Cheng Chen
Ramalingam Chellappa
Abhinav Shrivastava
44
33
0
16 Oct 2020
What is More Likely to Happen Next? Video-and-Language Future Event Prediction
Jie Lei
Licheng Yu
Tamara L. Berg
Mohit Bansal
33
72
0
15 Oct 2020
DORi: Discovering Object Relationship for Moment Localization of a Natural-Language Query in Video
Cristian Rodriguez-Opazo
Edison Marrese-Taylor
Basura Fernando
Hongdong Li
Stephen Gould
142
11
0
13 Oct 2020
Boosting Continuous Sign Language Recognition via Cross Modality Augmentation
Junfu Pu
Wen-gang Zhou
Hezhen Hu
Houqiang Li
43
108
0
11 Oct 2020
Watch, read and lookup: learning to spot signs from multiple supervisors
Liliane Momeni
Gül Varol
Samuel Albanie
Triantafyllos Afouras
Andrew Zisserman
36
43
0
08 Oct 2020
A Variational Information Bottleneck Based Method to Compress Sequential Networks for Human Action Recognition
Ayush Srivastava
Oshin Dutta
A. Prathosh
Sumeet Agarwal
Jigyasa Gupta
17
8
0
03 Oct 2020
RefVOS: A Closer Look at Referring Expressions for Video Object Segmentation
Míriam Bellver
Carles Ventura
Carina Silberer
Ioannis V. Kazakos
Jordi Torres
Xavier Giró-i-Nieto
VOS
29
32
0
01 Oct 2020
PERF-Net: Pose Empowered RGB-Flow Net
Yinxiao Li
Zhichao Lu
Xuehan Xiong
Jonathan Huang
3DH
40
17
0
28 Sep 2020
Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion
Jinpeng Wang
Yuting Gao
Ke Li
Jianguo Hu
Xinyang Jiang
Xiao-Wei Guo
Rongrong Ji
Xing Sun
40
62
0
12 Sep 2020
Defending Against Multiple and Unforeseen Adversarial Videos
Shao-Yuan Lo
Vishal M. Patel
AAML
31
23
0
11 Sep 2020
HAA500: Human-Centric Atomic Action Dataset with Curated Videos
Jihoon Chung
Cheng-hsin Wuu
Hsuan-ru Yang
Yu-Wing Tai
Chi-Keung Tang
23
43
0
11 Sep 2020
Dual Encoding for Video Retrieval by Text
Jianfeng Dong
Xirong Li
Chaoxi Xu
Xun Yang
Gang Yang
Xun Wang
Meng Wang
24
2
0
10 Sep 2020
Learning from Multiple Datasets with Heterogeneous and Partial Labels for Universal Lesion Detection in CT
K. Yan
Jinzheng Cai
Youjing Zheng
Adam P. Harrison
D. Jin
Youbao Tang
Yuxing Tang
Lingyun Huang
Jing Xiao
Le Lu
47
84
0
05 Sep 2020
Video Moment Retrieval via Natural Language Queries
Xinli Yu
Mohsen Malmir
C. He
Yue Liu
Rex Wu
22
1
0
04 Sep 2020
Deep Volumetric Universal Lesion Detection using Light-Weight Pseudo 3D Convolution and Surface Point Regression
Jinzheng Cai
K. Yan
Chi-Tung Cheng
Jing Xiao
Chien-Hung Liao
Le Lu
Adam P. Harrison
3DPC
MedIm
46
19
0
30 Aug 2020
DMD: A Large-Scale Multi-Modal Driver Monitoring Dataset for Attention and Alertness Analysis
J. Ortega
Neslihan Köse
P. Cañas
Min-An Chao
A. Unnervik
Marcos Nieto
Oihana Otaegui
L. Salgado
27
91
0
27 Aug 2020
Deep Spatial Transformation for Pose-Guided Person Image Generation and Animation
Yurui Ren
Ge Li
Shan Liu
Thomas H. Li
3DH
31
65
0
27 Aug 2020
Making a Case for 3D Convolutions for Object Segmentation in Videos
Sabarinath Mahadevan
A. Athar
Aljosa Osep
Sebastian Hennen
Laura Leal-Taixé
Bastian Leibe
VOS
21
87
0
26 Aug 2020
Effective Action Recognition with Embedded Key Point Shifts
Haozhi Cao
Yuecong Xu
Jianfei Yang
K. Mao
Jianxiong Yin
Simon See
15
7
0
26 Aug 2020
In-Home Daily-Life Captioning Using Radio Signals
Lijie Fan
Tianhong Li
Yuan Yuan
Dina Katabi
45
47
0
25 Aug 2020
Biased Mixtures Of Experts: Enabling Computer Vision Inference Under Data Transfer Limitations
Alhabib Abbas
Y. Andreopoulos
MoE
19
18
0
21 Aug 2020
Accuracy and Performance Comparison of Video Action Recognition Approaches
Matthew Hutchinson
S. Samsi
William Arcand
David Bestor
Bill Bergeron
...
Andrew Prout
Antonio Rosa
Albert Reuther
Charles Yee
V. Gadepally
14
5
0
20 Aug 2020
AssembleNet++: Assembling Modality Representations via Attention Connections
Michael S. Ryoo
A. Piergiovanni
Juhana Kangaspunta
A. Angelova
15
44
0
18 Aug 2020
Deep Domain Adaptation for Ordinal Regression of Pain Intensity Estimation Using Weakly-Labelled Videos
R Gnana Praveen
Eric Granger
P. Cardinal
31
20
0
13 Aug 2020
Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning
Ying Cheng
Ruize Wang
Zhihao Pan
Rui Feng
Yuejie Zhang
SSL
36
106
0
13 Aug 2020
Sharp Multiple Instance Learning for DeepFake Video Detection
Xiaodan Li
Yining Lang
YueFeng Chen
Xiaofeng Mao
Yuan He
Shuhui Wang
Hui Xue
Quan Lu
AAML
35
171
0
11 Aug 2020
Vision Meets Wireless Positioning: Effective Person Re-identification with Recurrent Context Propagation
Yiheng Liu
Wen-gang Zhou
Mao Xi
Sanjing Shen
Houqiang Li
28
8
0
10 Aug 2020
2nd Place Scheme on Action Recognition Track of ECCV 2020 VIPriors Challenges: An Efficient Optical Flow Stream Guided Framework
Haoyu Chen
Zitong Yu
Xin Liu
Wei Peng
Yoon Lee
Guoying Zhao
3DPC
31
4
0
10 Aug 2020
Spatiotemporal Contrastive Video Representation Learning
Rui Qian
Tianjian Meng
Boqing Gong
Ming-Hsuan Yang
Huisheng Wang
Serge J. Belongie
Yin Cui
SSL
AI4TS
41
493
0
09 Aug 2020
A Unified Framework for Shot Type Classification Based on Subject Centric Lens
Anyi Rao
Jiaze Wang
Linning Xu
Xuekun Jiang
Qingqiu Huang
Bolei Zhou
Dahua Lin
28
61
0
08 Aug 2020
PAN: Towards Fast Action Recognition via Learning Persistence of Appearance
Can Zhang
Yuexian Zou
Guang Chen
Lei Gan
15
39
0
08 Aug 2020
Self-supervised Video Representation Learning Using Inter-intra Contrastive Framework
Li Tao
Xueting Wang
T. Yamasaki
SSL
25
106
0
06 Aug 2020
Boundary Content Graph Neural Network for Temporal Action Proposal Generation
Y. Bai
Yingying Wang
Yunhai Tong
Yang Yang
Qiyue Liu
Junhui Liu
33
161
0
04 Aug 2020
Jointly Cross- and Self-Modal Graph Attention Network for Query-Based Moment Localization
Daizong Liu
Xiaoye Qu
Xiao-Yang Liu
Jianfeng Dong
Pan Zhou
Zichuan Xu
33
129
0
04 Aug 2020
Late Temporal Modeling in 3D CNN Architectures with BERT for Action Recognition
M. E. Kalfaoglu
Sinan Kalkan
A. Aydin Alatan
3DPC
39
140
0
03 Aug 2020
AUTSL: A Large Scale Multi-modal Turkish Sign Language Dataset and Baseline Methods
Ozge Mercanoglu Sincan
H. Keles
SLR
30
168
0
03 Aug 2020
The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020)
Samuel Albanie
Yang Liu
Arsha Nagrani
Antoine Miech
Ernesto Coto
...
Kaixu Cui
Hui Liu
Chen Wang
Yudong Jiang
Xiaoshuai Hao
34
9
0
03 Aug 2020
A review of deep learning in medical imaging: Imaging traits, technology trends, case studies with progress highlights, and future promises
S. Kevin Zhou
H. Greenspan
Christos Davatzikos
James S. Duncan
Bram van Ginneken
A. Madabhushi
Jerry L. Prince
Daniel Rueckert
Ronald M. Summers
58
627
0
02 Aug 2020
Hierarchical Action Classification with Network Pruning
Mahdi Davoodikakhki
KangKang Yin
34
19
0
30 Jul 2020
Learning Video Representations from Textual Web Supervision
Jonathan C. Stroud
Zhichao Lu
Chen Sun
Jia Deng
Rahul Sukthankar
Cordelia Schmid
David A. Ross
SSL
40
48
0
29 Jul 2020
Enriching Video Captions With Contextual Text
Philipp Rimle
Pelin Dogan
Markus Gross
30
3
0
29 Jul 2020
Approximated Bilinear Modules for Temporal Modeling
Xinqi Zhu
Chang Xu
Langwen Hui
Cewu Lu
Dacheng Tao
25
23
0
25 Jul 2020
The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism
Yosuke Oyama
N. Maruyama
Nikoli Dryden
Erin McCarthy
P. Harrington
J. Balewski
Satoshi Matsuoka
Peter Nugent
B. Van Essen
3DV
AI4CE
37
37
0
25 Jul 2020
AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification
Xiaofang Wang
Xuehan Xiong
Maxim Neumann
A. Piergiovanni
Michael S. Ryoo
A. Angelova
Kris Kitani
Wei Hua
24
51
0
23 Jul 2020