1705.07750
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
22 May 2017
João Carreira
Andrew Zisserman
Papers citing "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"
50 / 3,647 papers shown
Title
Comparing Correspondences: Video Prediction with Correspondence-wise Losses
Daniel Geng
Max Hamilton
Andrew Owens
3DH
97
16
0
19 Apr 2021
Temporal Query Networks for Fine-grained Video Understanding
Chuhan Zhang
Ankush Gupta
Andrew Zisserman
119
87
0
19 Apr 2021
Agent-Centric Representations for Multi-Agent Reinforcement Learning
Wenling Shang
L. Espeholt
Anton Raichuk
Tim Salimans
EgoV
55
10
0
19 Apr 2021
BM-NAS: Bilevel Multimodal Neural Architecture Search
Yihang Yin
Siyu Huang
Xiang Zhang
84
27
0
19 Apr 2021
Camera Calibration and Player Localization in SoccerNet-v2 and Investigation of their Representations for Action Spotting
A. Cioppa
Adrien Deliège
Floriane Magera
Silvio Giancola
Olivier Barnich
Guohao Li
Marc Van Droogenbroeck
75
58
0
19 Apr 2021
Metadata Normalization
Mandy Lu
Qingyu Zhao
Jiequan Zhang
K. Pohl
L. Fei-Fei
Juan Carlos Niebles
Ehsan Adeli
70
20
0
19 Apr 2021
Higher Order Recurrent Space-Time Transformer for Video Action Prediction
Tsung-Ming Tai
G. Fiameni
Cheng-Kuang Lee
Oswald Lanz
66
9
0
17 Apr 2021
TEACHTEXT: CrossModal Generalized Distillation for Text-Video Retrieval
Ioana Croitoru
Simion-Vlad Bogolin
Marius Leordeanu
Hailin Jin
Andrew Zisserman
Samuel Albanie
Yang Liu
VGen
67
125
0
16 Apr 2021
Temporally smooth online action detection using cycle-consistent future anticipation
Young Hwi Kim
Seonghyeon Nam
Seon Joo Kim
OffRL
72
30
0
16 Apr 2021
VGNMN: Video-grounded Neural Module Network to Video-Grounded Language Tasks
Hung Le
Nancy F. Chen
Guosheng Lin
MLLM
83
19
0
16 Apr 2021
Action Segmentation with Mixed Temporal Domain Adaptation
Min-Hung Chen
Baopu Li
Yingze Bao
Ghassan AlRegib
120
30
0
15 Apr 2021
Weakly Supervised Video Anomaly Detection via Center-guided Discriminative Learning
Boyang Wan
Yuming Fang
Xue Xia
Jiajie Mei
58
135
0
15 Apr 2021
Adaptive Intermediate Representations for Video Understanding
Juhana Kangaspunta
A. Piergiovanni
Rico Jonschkowski
Michael S. Ryoo
A. Angelova
51
3
0
14 Apr 2021
Temporally-Aware Feature Pooling for Action Spotting in Soccer Broadcasts
Silvio Giancola
Guohao Li
65
45
0
14 Apr 2021
Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction
Wonkwang Lee
Whie Jung
Han Zhang
Ting Chen
Jing Yu Koh
Thomas E. Huang
Hyungsuk Yoon
Honglak Lee
Seunghoon Hong
57
29
0
14 Apr 2021
ADNet: Temporal Anomaly Detection in Surveillance Videos
H. Öztürk
Ahmet Burak Can
130
15
0
14 Apr 2021
Towards Extremely Compact RNNs for Video Recognition with Fully Decomposed Hierarchical Tucker Structure
Miao Yin
Siyu Liao
Xiao-Yang Liu
Xiaodong Wang
Bo Yuan
AI4TS
87
31
0
12 Apr 2021
Tensor Processing Primitives: A Programming Abstraction for Efficiency and Portability in Deep Learning & HPC Workloads
E. Georganas
Dhiraj D. Kalamkar
Sasikanth Avancha
Menachem Adelman
Deepti Aggarwal
...
Ramanarayan Mohanty
Hans Pabst
Brian Retford
Barukh Ziv
A. Heinecke
114
18
0
12 Apr 2021
Object Priors for Classifying and Localizing Unseen Actions
Pascal Mettes
William Thong
Cees G. M. Snoek
85
21
0
10 Apr 2021
Unidentified Video Objects: A Benchmark for Dense, Open-World Segmentation
Weiyao Wang
Matt Feiszli
Heng Wang
Du Tran
VOS
89
127
0
10 Apr 2021
Video-aided Unsupervised Grammar Induction
Songyang Zhang
Linfeng Song
Lifeng Jin
Kun Xu
Dong Yu
Jiebo Luo
63
27
0
09 Apr 2021
FIBER: Fill-in-the-Blanks as a Challenging Video Understanding Evaluation Framework
Santiago Castro
Ruoyao Wang
Pingxuan Huang
Ian Stewart
Oana Ignat
Nan Liu
Jonathan C. Stroud
Rada Mihalcea
AIMat
91
11
0
09 Apr 2021
TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild
Vida Adeli
Mahsa Ehsanpour
Ian Reid
Juan Carlos Niebles
Silvio Savarese
Ehsan Adeli
Hamid Rezatofighi
76
61
0
08 Apr 2021
Few-Shot Action Recognition with Compromised Metric via Optimal Transport
Su Lu
Han-Jia Ye
De-Chuan Zhan
90
18
0
08 Apr 2021
Progressive Temporal Feature Alignment Network for Video Inpainting
Xueyan Zou
Linjie Yang
Ding Liu
Yong Jae Lee
84
57
0
08 Apr 2021
Self-Supervised Learning for Semi-Supervised Temporal Action Proposal
Xiang Wang
Shiwei Zhang
Zhiwu Qing
Yuanjie Shao
Changxin Gao
Nong Sang
74
68
0
07 Apr 2021
The SARAS Endoscopic Surgeon Action Detection (ESAD) dataset: Challenges and methods
V. Bawa
Gurkirt Singh
Francis KapingA
I. Skarga-Bandurova
Elettra Oleari
...
Li Li
Armando Stabile
Francesco Setti
R. Muradore
Fabio Cuzzolin
61
41
0
07 Apr 2021
ACM-Net: Action Context Modeling Network for Weakly-Supervised Temporal Action Localization
Sanqing Qu
Guang Chen
Zhijun Li
Lijun Zhang
Fan Lu
Alois C. Knoll
102
55
0
07 Apr 2021
The Multi-Agent Behavior Dataset: Mouse Dyadic Social Interactions
Jennifer J. Sun
Tomomi Karigo
Dipam Chakraborty
Sharada Mohanty
Benjamin Wild
...
Chen Chen
D. Anderson
Pietro Perona
Yisong Yue
Ann Kennedy
136
49
0
06 Apr 2021
Zeus: Efficiently Localizing Actions in Videos using Reinforcement Learning
Pramod Chunduri
J. Bang
Yao Lu
Joy Arulraj
54
12
0
06 Apr 2021
Few-Shot Transformation of Common Actions into Time and Space
Pengwan Yang
Pascal Mettes
Cees G. M. Snoek
VLM
ViT
53
10
0
06 Apr 2021
Adaptive Mutual Supervision for Weakly-Supervised Temporal Action Localization
Chen Ju
Peisen Zhao
Siheng Chen
Ya Zhang
Xiaoyun Zhang
Qi Tian
WSOL
80
20
0
06 Apr 2021
Can audio-visual integration strengthen robustness under multimodal attacks?
Yapeng Tian
Chenliang Xu
AAML
107
39
0
05 Apr 2021
MIST: Multiple Instance Self-Training Framework for Video Anomaly Detection
Jianfeng Feng
Fa-Ting Hong
Weishi Zheng
114
251
0
04 Apr 2021
Deepfake Detection Scheme Based on Vision Transformer and Distillation
Young-Jin Heo
Y. Choi
Young-Woon Lee
Byung-Gyu Kim
ViT
74
59
0
03 Apr 2021
Beyond Short Clips: End-to-End Video-Level Learning with Collaborative Memories
Xitong Yang
Haoqi Fan
Lorenzo Torresani
L. Davis
Heng Wang
VLM
84
21
0
02 Apr 2021
On the Pitfalls of Learning with Limited Data: A Facial Expression Recognition Case Study
Miguel Rodríguez Santander
Juan Felipe Hernandez Albarracin
Adín Ramirez Rivera
68
4
0
02 Apr 2021
M3L: Language-based Video Editing via Multi-Modal Multi-Level Transformers
Tsu-Jui Fu
Xinze Wang
Scott T. Grafton
Miguel P. Eckstein
Wenjie Wang
122
9
0
02 Apr 2021
Visual Semantic Role Labeling for Video Understanding
Arka Sadhu
Tanmay Gupta
Mark Yatskar
Ram Nevatia
Aniruddha Kembhavi
VLM
97
71
0
02 Apr 2021
UAV-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicles
Tianjiao Li
Jun Liu
Wei Emma Zhang
Yun Ni
Wenqian Wang
Zhiheng Li
AI4TS
109
192
0
02 Apr 2021
Self-supervised Video Representation Learning by Context and Motion Decoupling
Lianghua Huang
Yu Liu
Bin Wang
Pan Pan
Yinghui Xu
Rong Jin
SSL
114
51
0
02 Apr 2021
Memorability: An image-computable measure of information utility
Zoya Bylinskii
L. Goetschalckx
Anelise Newman
A. Oliva
HAI
40
19
0
01 Apr 2021
Multiview Pseudo-Labeling for Semi-supervised Learning from Video
Bo Xiong
Haoqi Fan
Kristen Grauman
Christoph Feichtenhofer
SSL
70
51
0
01 Apr 2021
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
VGen
236
1,194
0
01 Apr 2021
Motion Guided Attention Fusion to Recognize Interactions from Videos
Tae Soo Kim
Jonathan D. Jones
Gregory Hager
43
15
0
01 Apr 2021
CUPID: Adaptive Curation of Pre-training Data for Video-and-Language Representation Learning
Luowei Zhou
Jingjing Liu
Yu Cheng
Zhe Gan
Lei Zhang
75
7
0
01 Apr 2021
Self-supervised Motion Learning from Static Images
Ziyuan Huang
Shiwei Zhang
Jianwen Jiang
Mingqian Tang
Rong Jin
M. Ang
SSL
59
29
0
01 Apr 2021
A Survey on Natural Language Video Localization
Xinfang Liu
Xiushan Nie
Zhifang Tan
Jie Guo
Yilong Yin
123
7
0
01 Apr 2021
Adaptive Configuration of In Situ Lossy Compression for Cosmology Simulations via Fine-Grained Rate-Quality Modeling
Sian Jin
Jesus Pulido
Pascal Grosset
Jiannan Tian
Dingwen Tao
J. Ahrens
78
23
0
01 Apr 2021
Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective
Jiarui Xu
Xiaolong Wang
VOS
194
95
0
31 Mar 2021