1705.07750
Cited By
v1
v2
v3 (latest)
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
22 May 2017
João Carreira
Andrew Zisserman
Papers citing
"Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"
50 / 3,647 papers shown
Title
MutualNet: Adaptive ConvNet via Mutual Learning from Different Model Configurations
Taojiannan Yang
Sijie Zhu
Matías Mendieta
Pu Wang
Ravikumar Balakrishnan
Minwoo Lee
T. Han
M. Shah
Chong Chen
3DH
OOD
102
24
0
14 May 2021
Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation
Tianrui Hui
Shaofei Huang
Si Liu
Zihan Ding
Guanbin Li
Wenguan Wang
Jizhong Han
Fei Wang
79
49
0
14 May 2021
Video Corpus Moment Retrieval with Contrastive Learning
Hao Zhang
Aixin Sun
Wei Jing
Guoshun Nan
Liangli Zhen
Qiufeng Wang
Rick Siow Mong Goh
108
88
0
13 May 2021
Breaking Shortcut: Exploring Fully Convolutional Cycle-Consistency for Video Correspondence Learning
Yansong Tang
Zhenyu Jiang
Zhenda Xie
Yue Cao
Zheng Zhang
Philip Torr
Han Hu
116
6
0
12 May 2021
The DEVIL is in the Details: A Diagnostic Evaluation Benchmark for Video Inpainting
Ryan Szeto
Jason J. Corso
VGen
92
12
0
11 May 2021
Home Action Genome: Cooperative Compositional Action Understanding
Nishant Rai
Haofeng Chen
Jingwei Ji
Rishi Desai
Kazuki Kozuka
Shun Ishizaka
Ehsan Adeli
Juan Carlos Niebles
45
77
0
11 May 2021
Representation Learning via Global Temporal Alignment and Cycle-Consistency
Isma Hadji
Konstantinos G. Derpanis
Allan D. Jepson
AI4TS
145
55
0
11 May 2021
AdaMML: Adaptive Multi-Modal Learning for Efficient Video Recognition
Yikang Shen
Chun-Fu Chen
Quanfu Fan
Ximeng Sun
Kate Saenko
A. Oliva
Rogerio Feris
97
50
0
11 May 2021
ChaLearn LAP Large Scale Signer Independent Isolated Sign Language Recognition Challenge: Design, Results and Future Research
Ozge Mercanoglu
Julio C. S. Jacques Junior
Sergio Escalera
H. Keles
46
38
0
11 May 2021
Poisoning MorphNet for Clean-Label Backdoor Attack to Point Clouds
Guiyu Tian
Wenhao Jiang
Wei Liu
Yadong Mu
3DPC
AAML
63
14
0
11 May 2021
Learning Implicit Temporal Alignment for Few-shot Video Classification
Songyang Zhang
Jiale Zhou
Xuming He
AI4TS
88
45
0
11 May 2021
Stochastic Image-to-Video Synthesis using cINNs
Michael Dorkenwald
Timo Milbich
A. Blattmann
Robin Rombach
Konstantinos G. Derpanis
Bjorn Ommer
DiffM
VGen
115
55
0
10 May 2021
Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions
Mathew Monfort
SouYoung Jin
Alexander H. Liu
David Harwath
Rogerio Feris
James Glass
Aude Oliva
56
60
0
10 May 2021
Event-LSTM: An Unsupervised and Asynchronous Learning-based Representation for Event-based Data
Lakshmi Annamalai
Vignesh Ramanathan
Chetan Singh Thakur
69
15
0
10 May 2021
Action Shuffling for Weakly Supervised Temporal Localization
Xiaoyu Zhang
Haichao Shi
Changsheng Li
Xinchu Shi
WSOL
76
11
0
10 May 2021
Coupling Intent and Action for Pedestrian Crossing Behavior Prediction
Yu Yao
E. Atkins
Matthew Johnson-Roberson
Ram Vasudevan
Xiaoxiao Du
75
37
0
10 May 2021
Good Practices and A Strong Baseline for Traffic Anomaly Detection
Yuxiang Zhao
Wenhao Wu
Yue He
Yingying Li
Xiao Tan
Shifeng Chen
AI4TS
111
13
0
09 May 2021
Adaptive Focus for Efficient Video Recognition
Yulin Wang
Zhaoxi Chen
Haojun Jiang
Shiji Song
Yizeng Han
Gao Huang
106
100
0
07 May 2021
Human Object Interaction Detection using Two-Direction Spatial Enhancement and Exclusive Object Prior
Lu Liu
R. Tan
121
9
0
07 May 2021
Aligning Subtitles in Sign Language Videos
Hannah Bull
Triantafyllos Afouras
Gül Varol
Samuel Albanie
Liliane Momeni
Andrew Zisserman
SLR
50
30
0
06 May 2021
VideoLT: Large-scale Long-tailed Video Recognition
Xing Zhang
Zuxuan Wu
Zejia Weng
Huazhu Fu
Jingjing Chen
Yu-Gang Jiang
Larry S. Davis
114
42
0
06 May 2021
Unsupervised Visual Representation Learning by Tracking Patches in Video
Guangting Wang
Yizhou Zhou
Chong Luo
Wenxuan Xie
Wenjun Zeng
Zhiwei Xiong
SSL
82
24
0
06 May 2021
PLSM: A Parallelized Liquid State Machine for Unintentional Action Detection
Dipayan Das
Saumik Bhattacharya
Umapada Pal
S. Chanda
80
8
0
06 May 2021
Motion-Augmented Self-Training for Video Recognition at Smaller Scale
Kirill Gavrilyuk
Mihir Jain
I. Karmanov
Cees G. M. Snoek
71
21
0
04 May 2021
Where and When: Space-Time Attention for Audio-Visual Explanations
Yanbei Chen
Thomas Hummel
A. Sophia Koepke
Zeynep Akata
54
3
0
04 May 2021
Prediction of clinical tremor severity using Rank Consistent Ordinal Regression
Li Zhang
V. Yadav
V. Koesmahargyo
A. Abbas
I. Galatzer-Levy
31
0
0
03 May 2021
Unsupervised Discriminative Embedding for Sub-Action Learning in Complex Activities
S. Swetha
Hilde Kuehne
Yogesh S Rawat
M. Shah
80
16
0
30 Apr 2021
BiCnet-TKS: Learning Efficient Spatial-Temporal Representation for Video Person Re-Identification
Rui Hou
Hong Chang
Bingpeng Ma
Rui Huang
Shiguang Shan
82
88
0
30 Apr 2021
Action Unit Memory Network for Weakly Supervised Temporal Action Localization
Wang Luo
Tianzhu Zhang
Wenfei Yang
Jingen Liu
Tao Mei
Feng Wu
Yongdong Zhang
91
82
0
29 Apr 2021
Learning Synergistic Attention for Light Field Salient Object Detection
Y. Zhang
Geng Chen
Qian Chen
Yujia Sun
Yong Xia
Olivier Déforges
W. Hamidouche
Lu Zhang
117
24
0
28 Apr 2021
Sign Segmentation with Changepoint-Modulated Pseudo-Labelling
Katrin Renz
N. Stache
Neil Fox
Gül Varol
Samuel Albanie
82
18
0
28 Apr 2021
Medical Transformer: Universal Brain Encoder for 3D MRI Analysis
E. Jun
Seungwoo Jeong
Da-Woon Heo
Heung-Il Suk
ViT
MedIm
99
43
0
28 Apr 2021
Revisiting Skeleton-based Action Recognition
Haodong Duan
Yue Zhao
Kai-xiang Chen
Dahua Lin
Bo Dai
3DH
104
504
0
28 Apr 2021
FrameExit: Conditional Early Exiting for Efficient Video Recognition
Amir Ghodrati
B. Bejnordi
A. Habibian
147
81
0
27 Apr 2021
Three-stream network for enriched Action Recognition
Ivaxi Sheth
39
4
0
27 Apr 2021
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos
Brian Chen
Andrew Rouditchenko
Kevin Duarte
Hilde Kuehne
Samuel Thomas
...
Rogerio Feris
David Harwath
James R. Glass
M. Picheny
Shih-Fu Chang
SSL
83
92
0
26 Apr 2021
Temp-Frustum Net: 3D Object Detection with Temporal Fusion
Emeç Erçelik
Ekim Yurtsever
Alois C. Knoll
3DPC
58
6
0
25 Apr 2021
VidTr: Video Transformer Without Convolutions
Yanyi Zhang
Xinyu Li
Chunhui Liu
Bing Shuai
Yi Zhu
Biagio Brattoli
Hao Chen
I. Marsic
Joseph Tighe
ViT
273
199
0
23 Apr 2021
Supervised Video Summarization via Multiple Feature Sets with Parallel Attention
J. Ghauri
Sherzod Hakimov
Ralph Ewerth
71
48
0
23 Apr 2021
Modeling long-term interactions to enhance action recognition
Alejandro Cartas
Petia Radeva
Mariella Dimiccoli
EgoV
51
6
0
23 Apr 2021
SportsCap: Monocular 3D Human Motion Capture and Fine-grained Understanding in Challenging Sports Videos
Xin Chen
Anqi Pang
Wei Yang
Yuexin Ma
Lan Xu
Jingyi Yu
240
59
0
23 Apr 2021
Low Pass Filter for Anti-aliasing in Temporal Action Localization
Cece Jin
Yuanqi Chen
Ge Li
Tao Zhang
Thomas H. Li
51
1
0
23 Apr 2021
Multiscale Vision Transformers
Haoqi Fan
Bo Xiong
K. Mangalam
Yanghao Li
Zhicheng Yan
Jitendra Malik
Christoph Feichtenhofer
ViT
146
1,274
0
22 Apr 2021
H2O: Two Hands Manipulating Objects for First Person Interaction Recognition
Taein Kwon
Bugra Tekin
Jan Stühmer
Federica Bogo
Marc Pollefeys
EgoV
118
184
0
22 Apr 2021
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
Hassan Akbari
Liangzhe Yuan
Rui Qian
Wei-Hong Chuang
Shih-Fu Chang
Huayu Chen
Boqing Gong
ViT
375
594
0
22 Apr 2021
Distilling Audio-Visual Knowledge by Compositional Contrastive Learning
Yanbei Chen
Yongqin Xian
A. Sophia Koepke
Ying Shan
Zeynep Akata
149
83
0
22 Apr 2021
Evaluating the Immediate Applicability of Pose Estimation for Sign Language Recognition
Amit Moryossef
Ioannis Tsochantaridis
Joe Dinn
Necati Cihan Camgöz
Richard Bowden
Tao Jiang
Annette Rios Gonzales
Mathias Müller
Sarah Ebling
SLR
63
54
0
20 Apr 2021
MGSampler: An Explainable Sampling Strategy for Video Action Recognition
Yuan Zhi
Zhan Tong
Limin Wang
Gangshan Wu
TTA
64
74
0
20 Apr 2021
M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection
Junke Wang
Zuxuan Wu
Wenhao Ouyang
Xintong Han
Jingjing Chen
Ser-Nam Lim
Yu-Gang Jiang
ViT
194
277
0
20 Apr 2021
HCMS: Hierarchical and Conditional Modality Selection for Efficient Video Recognition
Zejia Weng
Zuxuan Wu
Hengduo Li
Jingjing Chen
Yu-Gang Jiang
80
4
0
20 Apr 2021