1705.07750
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
22 May 2017
João Carreira
Andrew Zisserman
Papers citing
"Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"
50 / 1,475 papers shown
Title
Local-Global Video-Text Interactions for Temporal Grounding
Jonghwan Mun
Minsu Cho
Bohyung Han
36
267
0
16 Apr 2020
Towards Anomaly Detection in Dashcam Videos
S. Haresh
Sateesh Kumar
M. Zia
Quoc-Huy Tran
27
30
0
11 Apr 2020
Would Mega-scale Datasets Further Enhance Spatiotemporal 3D CNNs?
Hirokatsu Kataoka
Tenga Wakamiya
Kensho Hara
Y. Satoh
3DPC
31
87
0
10 Apr 2020
X3D: Expanding Architectures for Efficient Video Recognition
Christoph Feichtenhofer
78
1,004
0
09 Apr 2020
Dense Regression Network for Video Grounding
Runhao Zeng
Haoming Xu
Wenbing Huang
Peihao Chen
Mingkui Tan
Chuang Gan
22
284
0
07 Apr 2020
TEA: Temporal Excitation and Aggregation for Action Recognition
Yan-Ran Li
Bin Ji
Xintian Shi
Jianguo Zhang
Bin Kang
Limin Wang
ViT
42
439
0
03 Apr 2020
Knowing What, Where and When to Look: Efficient Video Action Modeling with Attention
Juan-Manuel Perez-Rua
Brais Martínez
Xiatian Zhu
Antoine Toisoul
Victor Escorcia
Tao Xiang
48
19
0
02 Apr 2020
Temporal Accumulative Features for Sign Language Recognition
A. Kındıroglu
Ogulcan Özdemir
L. Akarun
SLR
16
18
0
02 Apr 2020
Learning Longterm Representations for Person Re-Identification Using Radio Signals
Lijie Fan
Tianhong Li
Rongyao Fang
Rumen Hristov
Yuan Yuan
Dina Katabi
27
86
0
02 Apr 2020
Weakly-Supervised Action Localization with Expectation-Maximization Multi-Instance Learning
Zhekun Luo
Devin Guillory
Baifeng Shi
Wei Ke
Fang Wan
Trevor Darrell
Huijuan Xu
19
119
0
31 Mar 2020
Explaining Motion Relevance for Activity Recognition in Video Deep Learning Models
Liam Hiley
Alun D. Preece
Y. Hicks
Supriyo Chakraborty
Prudhvi K. Gurram
Richard J. Tomsett
FAtt
25
15
0
31 Mar 2020
Long Short-Term Relation Networks for Video Action Detection
Dong Li
Ting Yao
Zhaofan Qiu
Houqiang Li
Tao Mei
12
22
0
31 Mar 2020
Spatio-Temporal Graph for Video Captioning with Knowledge Distillation
Boxiao Pan
Haoye Cai
De-An Huang
Kuan-Hui Lee
Adrien Gaidon
Ehsan Adeli
Juan Carlos Niebles
31
235
0
31 Mar 2020
Combining detection and tracking for human pose estimation in videos
Manchen Wang
Joseph Tighe
Davide Modolo
VOT
26
109
0
30 Mar 2020
Speech2Action: Cross-modal Supervision for Action Recognition
Arsha Nagrani
Chen Sun
David A. Ross
Rahul Sukthankar
Cordelia Schmid
Andrew Zisserman
33
54
0
30 Mar 2020
Learning Interactions and Relationships between Movie Characters
Anna Kukleva
Makarand Tapaswi
Ivan Laptev
41
51
0
29 Mar 2020
Learning a Weakly-Supervised Video Actor-Action Segmentation Model with a Wise Selection
Jie Chen
Zhiheng Li
Jiebo Luo
Chenliang Xu
27
13
0
29 Mar 2020
Omni-sourced Webly-supervised Learning for Video Recognition
Haodong Duan
Yue Zhao
Yuanjun Xiong
Wentao Liu
Dahua Lin
VLM
23
88
0
29 Mar 2020
Actor-Transformers for Group Activity Recognition
Kirill Gavrilyuk
Ryan Sanford
Mehrsan Javan
Cees G. M. Snoek
ViT
19
178
0
28 Mar 2020
Multi-Granularity Reference-Aided Attentive Feature Aggregation for Video-based Person Re-identification
Zhizheng Zhang
Cuiling Lan
Wenjun Zeng
Zhibo Chen
VOS
27
98
0
27 Mar 2020
Negative Margin Matters: Understanding Margin in Few-shot Classification
Bin Liu
Yue Cao
Yutong Lin
Qi Li
Zheng-Wei Zhang
Mingsheng Long
Han Hu
35
318
0
26 Mar 2020
Coronary Artery Segmentation in Angiographic Videos Using A 3D-2D CE-Net
Lu Wang
Dongxue Liang
Xiao-Lei Yin
Jing Qiu
Zhi-Yun Yang
Jun-Hui Xing
Jian-Zeng Dong
Zhao-Yuan Ma
MedIm
21
0
0
26 Mar 2020
Learning Object Permanence from Video
Aviv Shamsian
Ofri Kleinfeld
Amir Globerson
Gal Chechik
SSL
47
31
0
23 Mar 2020
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
Longteng Guo
Jing Liu
Xinxin Zhu
Peng Yao
Shichen Lu
Hanqing Lu
ViT
135
189
0
19 Mar 2020
PIC: Permutation Invariant Convolution for Recognizing Long-range Activities
Noureldien Hussein
E. Gavves
A. Smeulders
VLM
31
13
0
18 Mar 2020
Multi-modal Dense Video Captioning
Vladimir E. Iashin
Esa Rahtu
22
165
0
17 Mar 2020
On Translation Invariance in CNNs: Convolutional Layers can Exploit Absolute Spatial Location
O. Kayhan
Jan van Gemert
213
233
0
16 Mar 2020
SF-Net: Single-Frame Supervision for Temporal Action Localization
Fan Ma
Linchao Zhu
Yi Yang
Shengxin Cindy Zha
Gourab Kundu
Matt Feiszli
Zheng Shou
18
140
0
15 Mar 2020
Interaction Graphs for Object Importance Estimation in On-road Driving Videos
Zehua Zhang
Ashish Tawari
Sujitha Martin
David J. Crandall
GNN
FAtt
17
23
0
12 Mar 2020
Visual Grounding in Video for Unsupervised Word Translation
Gunnar A. Sigurdsson
Jean-Baptiste Alayrac
Aida Nematzadeh
Lucas Smaira
Mateusz Malinowski
João Carreira
Phil Blunsom
Andrew Zisserman
VGen
27
49
0
11 Mar 2020
Accurate Temporal Action Proposal Generation with Relation-Aware Pyramid Network
Jialin Gao
Zhixiang Shi
Jiani Li
Guanshuo Wang
Yufeng Yuan
Shiming Ge
Xiaoping Zhou
13
73
0
09 Mar 2020
Better Captioning with Sequence-Level Exploration
Jia Chen
Qin Jin
37
12
0
08 Mar 2020
Transferring Cross-domain Knowledge for Video Sign Language Recognition
Dongxu Li
Xin Yu
Chenchen Xu
L. Petersson
Hongdong Li
SLR
36
104
0
08 Mar 2020
Noise Estimation Using Density Estimation for Self-Supervised Multimodal Learning
Elad Amrani
Rami Ben-Ari
Daniel Rotman
A. Bronstein
17
121
0
06 Mar 2020
Detecting Attended Visual Targets in Video
Eunji Chong
Yongxin Wang
Nataniel Ruiz
James M. Rehg
199
112
0
05 Mar 2020
Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications
Biagio Brattoli
Joseph Tighe
Fedor Zhdanov
Pietro Perona
Krzysztof Chalupka
VLM
137
127
0
03 Mar 2020
Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning
Shizhe Chen
Yida Zhao
Qin Jin
Qi Wu
48
310
0
01 Mar 2020
Joint 2D-3D Breast Cancer Classification
G. Liang
Xiaoqin Wang
Yu Zhang
Xin Xing
Hunter Blanton
Tawfiq Salem
Nathan Jacobs
31
39
0
27 Feb 2020
Evolving Losses for Unsupervised Video Representation Learning
A. Piergiovanni
A. Angelova
Michael S. Ryoo
SSL
27
138
0
26 Feb 2020
Fine-Grained Instance-Level Sketch-Based Video Retrieval
Peng Xu
Kun Liu
Tao Xiang
Timothy M. Hospedales
Zhanyu Ma
Jun Guo
Yi-Zhe Song
32
32
0
21 Feb 2020
Strength from Weakness: Fast Learning Using Weak Supervision
Joshua Robinson
Stefanie Jegelka
S. Sra
43
32
0
19 Feb 2020
Over-the-Air Adversarial Flickering Attacks against Video Recognition Networks
Roi Pony
I. Naeh
Shie Mannor
AAML
21
51
0
12 Feb 2020
An End-to-End Visual-Audio Attention Network for Emotion Recognition in User-Generated Videos
Sicheng Zhao
Yunsheng Ma
Yang Gu
Jufeng Yang
Tengfei Xing
Pengfei Xu
Runbo Hu
Hua Chai
Kurt Keutzer
19
98
0
12 Feb 2020
Dynamic Inference: A New Approach Toward Efficient Video Action Recognition
Wenhao Wu
Dongliang He
Xiao Tan
Shifeng Chen
Yi Yang
Shilei Wen
24
35
0
09 Feb 2020
Weakly-Supervised Multi-Person Action Recognition in 360° Videos
Junnan Li
Jianquan Liu
Yongkang Wong
Shoji Nishimura
Mohan S. Kankanhalli
31
13
0
09 Feb 2020
Solving Raven's Progressive Matrices with Neural Networks
Tao Zhuo
Mohan S. Kankanhalli
27
26
0
05 Feb 2020
Action Graphs: Weakly-supervised Action Localization with Graph Convolution Networks
M. Rashid
Hedvig Kjellström
Yong Jae Lee
WSOL
GNN
19
46
0
04 Feb 2020
Interpreting video features: a comparison of 3D convolutional networks and convolutional LSTM networks
Joonatan Mänttäri
Sofia Broomé
John Folkesson
Hedvig Kjellström
FAtt
27
27
0
02 Feb 2020
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
Jie Lei
Licheng Yu
Tamara L. Berg
Joey Tianyi Zhou
119
277
0
24 Jan 2020
Audiovisual SlowFast Networks for Video Recognition
Fanyi Xiao
Yong Jae Lee
Kristen Grauman
Jitendra Malik
Christoph Feichtenhofer
197
207
0
23 Jan 2020