Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

22 May 2017

Papers citing "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"

50 / 3,646 papers shown

SoccerDB: A Large-Scale Database for Comprehensive Video Understanding Yudong Jiang Kaixu Cui Leilei Chen Canjin Wang Changliang Xu 53 2 0 10 Dec 2019
Flow-Distilled IP Two-Stream Networks for Compressed Video Action Recognition Shiyuan Huang Xudong Lin Svebor Karaman Shih-Fu Chang 43 10 0 10 Dec 2019
Car Pose in Context: Accurate Pose Estimation with Ground Plane Constraints Pengfei Li Weichao Qiu Michael Peven Gregory Hager Alan Yuille 3DH 34 0 0 09 Dec 2019
Video action detection by learning graph-based spatio-temporal interactions Matteo Tomei Lorenzo Baraldi Simone Calderara Simone Bronzin Rita Cucchiara 131 9 0 09 Dec 2019
Synthetic Humans for Action Recognition from Unseen Viewpoints Gül Varol Ivan Laptev Cordelia Schmid Andrew Zisserman 101 99 0 09 Dec 2019
VideoDG: Generalizing Temporal Relations in Videos to Novel Domains Zhiyu Yao Yunbo Wang Jianmin Wang Philip S. Yu Mingsheng Long OOD ViT 73 26 0 08 Dec 2019
DASZL: Dynamic Action Signatures for Zero-shot Learning Tae Soo Kim Jonathan D. Jones Michael Peven Zihao Xiao Jin Bai Yi Zhang Weichao Qiu Alan Yuille Gregory Hager 64 3 0 08 Dec 2019
Context R-CNN: Long Term Temporal Context for Per-Camera Object Detection Sara Beery Guanhang Wu V. Rathod Ronny Votel Jonathan Huang ObjD 109 116 0 07 Dec 2019
Spatio-Temporal Pyramid Graph Convolutions for Human Action Recognition and Postural Assessment Behnoosh Parsa Athma Narayanan Behzad Dariush 3DH 81 21 0 07 Dec 2019
Generating Videos of Zero-Shot Compositions of Actions and Objects Megha Nawhal Mengyao Zhai Andreas M. Lehrmann Leonid Sigal Greg Mori 111 1 0 05 Dec 2019
Automatic Video Object Segmentation via Motion-Appearance-Stream Fusion and Instance-aware Segmentation Sung-Kwon Choo Wonkyo Seo N. Cho VOS 46 0 0 03 Dec 2019
A Context-Aware Loss Function for Action Spotting in Soccer Videos A. Cioppa Adrien Deliège Silvio Giancola Bernard Ghanem Marc Van Droogenbroeck Rikke Gade T. Moeslund 86 81 0 03 Dec 2019
RSA: Randomized Simulation as Augmentation for Robust Human Action Recognition Yi Zhang Xinyue Wei Weichao Qiu Zihao Xiao Gregory Hager Alan Yuille 66 6 0 03 Dec 2019
BERT for Large-scale Video Segment Classification with Test-time Augmentation Tianqi Liu Qizhan Shao 71 4 0 02 Dec 2019
A Multigrid Method for Efficiently Training Video Models Chao-Yuan Wu Ross B. Girshick Kaiming He Christoph Feichtenhofer Philipp Krahenbuhl 91 94 0 02 Dec 2019
More Is Less: Learning Efficient Video Representations by Big-Little Network and Depthwise Temporal Aggregation Quanfu Fan Chun-Fu Chen Hilde Kuehne Marco Pistoia David D. Cox 97 127 0 02 Dec 2019
Gate-Shift Networks for Video Action Recognition Swathikiran Sudhakaran Sergio Escalera Oswald Lanz 3DPC 97 155 0 01 Dec 2019
Exploiting Motion Information from Unlabeled Videos for Static Image Action Recognition Yiyi Zhang Li Niu Ziqi Pan Meichao Luo Jianfu Zhang Dawei Cheng Liqing Zhang 30 7 0 01 Dec 2019
Action Recognition via Pose-Based Graph Convolutional Networks with Intermediate Dense Supervision Lei Shi Yifan Zhang Jian Cheng Hanqing Lu 66 27 0 28 Nov 2019
G-TAD: Sub-Graph Localization for Temporal Action Detection Mengmeng Xu Chen Zhao D. Rojas Ali K. Thabet Bernard Ghanem 136 437 0 26 Nov 2019
Learning Efficient Video Representation with Video Shuffle Networks Pingchuan Ma Yao Zhou Yu Lu Wayne Zhang 63 7 0 26 Nov 2019
SRG: Snippet Relatedness-based Temporal Action Proposal Generator Hyunjun Eun Sumin Lee Jinyoung Moon Jongyoul Park Chanho Jung Changick Kim 50 24 0 26 Nov 2019
Oops! Predicting Unintentional Action in Video Dave Epstein Boyuan Chen Carl Vondrick 115 103 0 25 Nov 2019
Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video Miao Liu Siyu Tang Yin Li James M. Rehg EgoV 84 21 0 25 Nov 2019
Deep Image-to-Video Adaptation and Fusion Networks for Action Recognition Yang Liu Zhaoyang Lu Jing Li Tao Yang Chao Yao 92 51 0 25 Nov 2019
Zero-Shot Imitating Collaborative Manipulation Plans from YouTube Cooking Videos Hejia Zhang Jie Zhong Stefanos Nikolaidis LM&Ro 414 1 0 25 Nov 2019
Reinventing 2D Convolutions for 3D Images Jiancheng Yang Xiaoyang Huang Yi He Jingwei Xu Canqian Yang Guozheng Xu Bingbing Ni 109 11 0 24 Nov 2019
Characterizing the impact of using features extracted from pre-trained models on the quality of video captioning sequence-to-sequence models Menatallh Hammad May Hammad Mohamed Elshenawy 33 2 0 22 Nov 2019
Background Suppression Network for Weakly-supervised Temporal Action Localization Pilhyeon Lee Youngjung Uh H. Byun 154 214 0 22 Nov 2019
Third-Person Visual Imitation Learning via Decoupled Hierarchical Controller Pratyusha Sharma Deepak Pathak Abhinav Gupta SSL 96 120 0 21 Nov 2019
TEINet: Towards an Efficient Architecture for Video Recognition Zhaoyang Liu Donghao Luo Yabiao Wang Limin Wang Ying Tai Chengjie Wang Jilin Li Feiyue Huang Tong Lu ViT 99 243 0 21 Nov 2019
Multi-Label Classification with Label Graph Superimposing Ya Wang Dongliang He Fu Li Xiang Long Zhichao Zhou Jinwen Ma Shilei Wen 82 170 0 21 Nov 2019
MMTM: Multimodal Transfer Module for CNN Fusion Hamid Reza Vaezi Joze Amirreza Shaban Michael L. Iuzzolino K. Koishida 113 284 0 20 Nov 2019
Cross-Class Relevance Learning for Temporal Concept Localization Junwei Ma S. Gorti M. Volkovs I. Stanevich Guangwei Yu 45 7 0 19 Nov 2019
Action Recognition Using Volumetric Motion Representations Michael Peven Gregory Hager A. Reiter 3DPC 58 0 0 19 Nov 2019
Mimic The Raw Domain: Accelerating Action Recognition in the Compressed Domain Barak Battash H. Barad Hanlin Tang Amit Bleiweiss 39 30 0 19 Nov 2019
Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention Vivien Sainte Fare Garnot Loic Landrieu S. Giordano N. Chehata 96 155 0 18 Nov 2019
The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation Junjie Huang Zheng Zhu Feng Guo Guan Huang Dalong Du 3DH 76 196 0 18 Nov 2019
You Only Watch Once: A Unified CNN Architecture for Real-Time Spatiotemporal Action Localization Okan Kopuklu Xiangyu Wei Gerhard Rigoll 108 144 0 15 Nov 2019
RWF-2000: An Open Large Scale Video Database for Violence Detection Ming Cheng Kunjing Cai Ming Li 93 144 0 14 Nov 2019
Guided Weak Supervision for Action Recognition with Scarce Data to Assess Skills of Children with Autism Prashant Pandey Prathosh A. P. Manu Kohli Joshua K. Pritchard 89 33 0 11 Nov 2019
Fast Learning of Temporal Action Proposal via Dense Boundary Generator Chuming Lin Jian Li Yabiao Wang Ying Tai Donghao Luo Zhipeng Cui Chengjie Wang Jilin Li Feiyue Huang Rongrong Ji 92 215 0 11 Nov 2019
Certified Data Removal from Machine Learning Models Chuan Guo Tom Goldstein Awni Y. Hannun Laurens van der Maaten MU 172 452 0 08 Nov 2019
Interpretable Self-Attention Temporal Reasoning for Driving Behavior Understanding Yi-Chieh Liu Yung-An Hsieh Min-Hung Chen Chao-Han Huck Yang Jesper N. Tegnér Y. Tsai 86 19 0 06 Nov 2019
Multi-Moments in Time: Learning and Interpreting Models for Multi-Action Video Understanding Mathew Monfort Bowen Pan K. Ramakrishnan A. Andonian Barry A. McNamara A. Lascelles Quanfu Fan Dan Gutfreund Rogerio Feris A. Oliva VLM 111 68 0 01 Nov 2019
Low-Rank HOCA: Efficient High-Order Cross-Modal Attention for Video Captioning Tao Jin Siyu Huang Yingming Li Zhongfei Zhang 79 20 0 01 Nov 2019
Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos Yitian Yuan Lin Ma Jingwen Wang Wei Liu Wenwu Zhu 113 244 0 31 Oct 2019
A Self Validation Network for Object-Level Human Attention Estimation Zehua Zhang Chen Yu David J. Crandall EgoV 95 10 0 31 Oct 2019
Comprehensive Video Understanding: Video summarization with content-based video recommender design Yudong Jiang Kaixu Cui B. Peng Changliang Xu BDL 58 28 0 30 Oct 2019
Skip-Clip: Self-Supervised Spatiotemporal Representation Learning by Future Clip Order Ranking Alaaeldin El-Nouby Shuangfei Zhai Graham W. Taylor J. Susskind AI4TS SSL CLIP 52 15 0 28 Oct 2019