1705.07750
Cited By
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
22 May 2017
João Carreira
Andrew Zisserman
Papers citing
"Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"
50 / 1,478 papers shown
Title
Adaptive Mutual Supervision for Weakly-Supervised Temporal Action Localization
Chen Ju
Peisen Zhao
Siheng Chen
Ya Zhang
Xiaoyun Zhang
Qi Tian
WSOL
44
19
0
06 Apr 2021
Can audio-visual integration strengthen robustness under multimodal attacks?
Yapeng Tian
Chenliang Xu
AAML
38
37
0
05 Apr 2021
MIST: Multiple Instance Self-Training Framework for Video Anomaly Detection
Jianfeng Feng
Fa-Ting Hong
Weishi Zheng
33
240
0
04 Apr 2021
Deepfake Detection Scheme Based on Vision Transformer and Distillation
Young-Jin Heo
Y. Choi
Young-Woon Lee
Byung-Gyu Kim
ViT
17
55
0
03 Apr 2021
On the Pitfalls of Learning with Limited Data: A Facial Expression Recognition Case Study
Miguel Rodríguez Santander
Juan Felipe Hernandez Albarracin
Adín Ramirez Rivera
29
4
0
02 Apr 2021
UAV-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicles
Tianjiao Li
Jun Liu
Wei Emma Zhang
Yun Ni
Wenqian Wang
Zhiheng Li
AI4TS
33
188
0
02 Apr 2021
Multiview Pseudo-Labeling for Semi-supervised Learning from Video
Bo Xiong
Haoqi Fan
Kristen Grauman
Christoph Feichtenhofer
SSL
29
49
0
01 Apr 2021
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
VGen
57
1,134
0
01 Apr 2021
A Survey on Natural Language Video Localization
Xinfang Liu
Xiushan Nie
Zhifang Tan
Jie Guo
Yilong Yin
36
7
0
01 Apr 2021
Adaptive Configuration of In Situ Lossy Compression for Cosmology Simulations via Fine-Grained Rate-Quality Modeling
Sian Jin
Jesus Pulido
Pascal Grosset
Jiannan Tian
Dingwen Tao
J. Ahrens
33
22
0
01 Apr 2021
Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective
Jiarui Xu
Xiaolong Wang
VOS
40
92
0
31 Mar 2021
Learning by Aligning Videos in Time
S. Haresh
Sateesh Kumar
Huseyin Coskun
S. N. Syed
Andrey Konin
M. Zia
Quoc-Huy Tran
AI4TS
29
64
0
31 Mar 2021
Broaden Your Views for Self-Supervised Video Learning
Adrià Recasens
Pauline Luc
Jean-Baptiste Alayrac
Luyu Wang
Ross Hemsley
...
Florent Altché
M. Valko
Jean-Bastien Grill
Aaron van den Oord
Andrew Zisserman
SSL
AI4TS
33
127
0
30 Mar 2021
Read and Attend: Temporal Localisation in Sign Language Videos
Gül Varol
Liliane Momeni
Samuel Albanie
Triantafyllos Afouras
Andrew Zisserman
SLR
24
40
0
30 Mar 2021
Face Forensics in the Wild
Tianfei Zhou
Wenguan Wang
Zhiyuan Liang
Jianbing Shen
CVBM
46
118
0
30 Mar 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
30
2,098
0
29 Mar 2021
Busy-Quiet Video Disentangling for Video Classification
Guoxi Huang
A. Bors
28
6
0
29 Mar 2021
SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events
Li Xu
He Huang
Jun Liu
ViT
LRM
17
83
0
29 Mar 2021
No frame left behind: Full Video Action Recognition
X. Liu
S. Pintea
F. Karimi Nejadasl
Olaf Booij
Jan van Gemert
21
41
0
29 Mar 2021
Low-Fidelity End-to-End Video Encoder Pre-training for Temporal Action Localization
Mengmeng Xu
Juan-Manuel Perez-Rua
Xiatian Zhu
Guohao Li
Brais Martinez
17
27
0
28 Mar 2021
Structured Co-reference Graph Attention for Video-grounded Dialogue
Junyeong Kim
Sunjae Yoon
Dahyun Kim
Chang D. Yoo
26
26
0
24 Mar 2021
The Blessings of Unlabeled Background in Untrimmed Videos
Yuan Liu
Jingyuan Chen
Zhenfang Chen
Bing Deng
Jianqiang Huang
Hanwang Zhang
CML
33
43
0
24 Mar 2021
Temporal Context Aggregation Network for Temporal Action Proposal Refinement
Zhiwu Qing
Haisheng Su
Weihao Gan
Dongliang Wang
Wei Wu
Xiang Wang
Yu Qiao
Junjie Yan
Changxin Gao
Nong Sang
30
173
0
24 Mar 2021
Learning Salient Boundary Feature for Anchor-free Temporal Action Localization
Chuming Lin
C. Xu
Donghao Luo
Yabiao Wang
Ying Tai
Chengjie Wang
Jilin Li
Feiyue Huang
Yanwei Fu
35
250
0
24 Mar 2021
AdaSGN: Adapting Joint Number and Model Size for Efficient Skeleton-Based Action Recognition
Lei Shi
Yifan Zhang
Jian Cheng
Hanqing Lu
30
46
0
22 Mar 2021
Computer Vision Aided URLL Communications: Proactive Service Identification and Coexistence
Muhammad Alrabeiah
Umut Demirhan
Andrew Hredzak
Ahmed Alkhateeb
11
4
0
18 Mar 2021
Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning
Mandela Patrick
Yuki M. Asano
Bernie Huang
Ishan Misra
Florian Metze
Joao Henriques
Andrea Vedaldi
AI4TS
31
33
0
18 Mar 2021
Decoupled Spatial Temporal Graphs for Generic Visual Grounding
Qi Feng
Yunchao Wei
Mingming Cheng
Yi Yang
27
5
0
18 Mar 2021
Enhancing Transformer for Video Understanding Using Gated Multi-Level Attention and Temporal Adversarial Training
Saurabh Sahu
Palash Goyal
ViT
37
2
0
18 Mar 2021
NAS-TC: Neural Architecture Search on Temporal Convolutions for Complex Action Recognition
Pengzhen Ren
Gang Xiao
Xiaojun Chang
Yun Xiao
Zhihui Li
Xiaojiang Chen
ViT
32
4
0
17 Mar 2021
Skeleton Aware Multi-modal Sign Language Recognition
Songyao Jiang
Bin Sun
Lichen Wang
Yue Bai
Kunpeng Li
Y. Fu
SLR
33
167
0
16 Mar 2021
ACTION-Net: Multipath Excitation for Action Recognition
Zhengwei Wang
Qi She
A. Smolic
3DPC
39
165
0
11 Mar 2021
Temporal Action Segmentation from Timestamp Supervision
Zhe Li
Yazan Abu Farha
Juergen Gall
18
81
0
11 Mar 2021
ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis
Yinan He
Bei Gan
Siyu Chen
Yichun Zhou
Guojun Yin
Luchuan Song
Lu Sheng
Jing Shao
Ziwei Liu
AAML
38
130
0
09 Mar 2021
Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning
Mamshad Nayeem Rizve
Salman Khan
Fahad Shahbaz Khan
M. Shah
41
109
0
01 Mar 2021
Coarse-Fine Networks for Temporal Activity Detection in Videos
Kumara Kahatapitiya
Michael S. Ryoo
AI4TS
58
38
0
01 Mar 2021
Predicting post-operative right ventricular failure using video-based deep learning
R. Shad
Nicolas Quach
R. Fong
P. Kasinpila
C. Bowles
...
Y. Woo
J. Teuteberg
John P. Cunningham
C. Langlotz
W. Hiesinger
22
40
0
28 Feb 2021
Natural Language Video Localization: A Revisit in Span-based Question Answering Framework
Hao Zhang
Aixin Sun
Wei Jing
Liangli Zhen
Qiufeng Wang
Rick Siow Mong Goh
113
84
0
26 Feb 2021
ROAD: The ROad event Awareness Dataset for Autonomous Driving
Gurkirt Singh
Stephen Akrigg
Manuele Di Maio
Valentina Fontana
Reza Javanmard Alitappeh
...
Salman Khan
S. Grazioso
Andrew Bradley
G. Gironimo
Fabio Cuzzolin
32
89
0
23 Feb 2021
Vision-Aided 6G Wireless Communications: Blockage Prediction and Proactive Handoff
Gouranga Charan
Muhammad Alrabeiah
Ahmed Alkhateeb
19
133
0
18 Feb 2021
Learning to Recognize Actions on Objects in Egocentric Video with Attention Dictionaries
Swathikiran Sudhakaran
Sergio Escalera
Oswald Lanz
EgoV
32
15
0
16 Feb 2021
VA-RED$^2$: Video Adaptive Redundancy Reduction
Bowen Pan
Yikang Shen
Camilo Luciano Fosco
Chung-Ching Lin
A. Andonian
Yue Meng
Kate Saenko
A. Oliva
Rogerio Feris
20
19
0
15 Feb 2021
RMS-Net: Regression and Masking for Soccer Event Spotting
Matteo Tomei
Lorenzo Baraldi
Simone Calderara
Simone Bronzin
Rita Cucchiara
45
28
0
15 Feb 2021
Win-Fail Action Recognition
Paritosh Parmar
B. Morris
29
5
0
15 Feb 2021
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling
Jie Lei
Linjie Li
Luowei Zhou
Zhe Gan
Tamara L. Berg
Joey Tianyi Zhou
Jingjing Liu
CLIP
46
648
0
11 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
283
1,992
0
09 Feb 2021
Privacy-preserving Cloud-based DNN Inference
Shangyu Xie
Bingyu Liu
Yuan Hong
FedML
19
6
0
07 Feb 2021
NTU-X: An Enhanced Large-scale Dataset for Improving Pose-based Recognition of Subtle Human Actions
Neel Trivedi
Anirudh Thatipelli
Ravi Kiran Sarvadevabhatla
27
18
0
27 Jan 2021
A Case Study of Deep Learning Based Multi-Modal Methods for Predicting the Age-Suitability Rating of Movie Trailers
Mahsa Shafaei
C. Smailis
I. Kakadiaris
Thamar Solorio
215
1
0
26 Jan 2021
ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning
Sangho Lee
Jiwan Chung
Youngjae Yu
Gunhee Kim
Thomas Breuel
Gal Chechik
Yale Song
71
45
0
26 Jan 2021