1705.06950
Cited By
The Kinetics Human Action Video Dataset
19 May 2017
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
Sudheendra Vijayanarasimhan
Fabio Viola
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman
Papers citing
"The Kinetics Human Action Video Dataset"
50 / 2,017 papers shown
Title
TNT: Text-Conditioned Network with Transductive Inference for Few-Shot Video Classification
Andrés Villa
Juan-Manuel Perez-Rua
Victor Escorcia
Vladimir Araujo
Juan Carlos Niebles
Alvaro Soto
27
0
0
21 Jun 2021
Video Summarization through Reinforcement Learning with a 3D Spatio-Temporal U-Net
Tianrui Liu
Qingjie Meng
Jun-Jie Huang
Athanasios Vlontzos
Daniel Rueckert
Bernhard Kainz
OffRL
AI4TS
26
70
0
19 Jun 2021
Attend What You Need: Motion-Appearance Synergistic Networks for Video Question Answering
Ahjeong Seo
Gi-Cheon Kang
J. Park
Byoung-Tak Zhang
18
53
0
19 Jun 2021
MaCLR: Motion-aware Contrastive Learning of Representations for Videos
Fanyi Xiao
Joseph Tighe
Davide Modolo
SSL
24
13
0
17 Jun 2021
Long-Short Temporal Contrastive Learning of Video Transformers
Jue Wang
Gedas Bertasius
Du Tran
Lorenzo Torresani
VLM
ViT
35
50
0
17 Jun 2021
JRDB-Act: A Large-scale Dataset for Spatio-temporal Action, Social Group and Activity Detection
Mahsa Ehsanpour
F. Saleh
Silvio Savarese
Ian Reid
Hamid Rezatofighi
30
42
0
16 Jun 2021
Temporal Predictive Coding For Model-Based Planning In Latent Space
Tung D. Nguyen
Rui Shu
Tu Pham
Hung Bui
Stefano Ermon
OffRL
36
57
0
14 Jun 2021
Cross-Modal Attention Consistency for Video-Audio Unsupervised Learning
Shaobo Min
Qi Dai
Hongtao Xie
Chuang Gan
Yongdong Zhang
Jingdong Wang
SSL
23
7
0
13 Jun 2021
Space-time Mixing Attention for Video Transformer
Adrian Bulat
Juan-Manuel Perez-Rua
Swathikiran Sudhakaran
Brais Martínez
Georgios Tzimiropoulos
ViT
36
125
0
10 Jun 2021
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
Mandela Patrick
Dylan Campbell
Yuki M. Asano
Ishan Misra
Florian Metze
Christoph Feichtenhofer
Andrea Vedaldi
João F. Henriques
41
276
0
09 Jun 2021
Towards Training Stronger Video Vision Transformers for EPIC-KITCHENS-100 Action Recognition
Ziyuan Huang
Zhiwu Qing
Xiang Wang
Yutong Feng
Shiwei Zhang
Jianwen Jiang
Zhurong Xia
Mingqian Tang
Nong Sang
M. Ang
ViT
27
11
0
09 Jun 2021
VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation
Linjie Li
Jie Lei
Zhe Gan
Licheng Yu
Yen-Chun Chen
...
Tamara L. Berg
Joey Tianyi Zhou
Jingjing Liu
Lijuan Wang
Zicheng Liu
VLM
32
100
0
08 Jun 2021
Hierarchical Video Generation for Complex Data
Lluis Castrejon
Nicolas Ballas
Aaron Courville
VGen
22
4
0
04 Jun 2021
RegionViT: Regional-to-Local Attention for Vision Transformers
Chun-Fu Chen
Yikang Shen
Quanfu Fan
ViT
29
195
0
04 Jun 2021
ASCNet: Self-supervised Video Representation Learning with Appearance-Speed Consistency
Deng Huang
Wenhao Wu
Weiwen Hu
Xu Liu
Dongliang He
Zhihua Wu
Xiangmiao Wu
Ming Tan
Errui Ding
SSL
29
55
0
04 Jun 2021
Anticipative Video Transformer
Rohit Girdhar
Kristen Grauman
ViT
27
209
0
03 Jun 2021
Attention mechanisms and deep learning for machine vision: A survey of the state of the art
A. M. Hafiz
S. A. Parah
R. A. Bhat
28
45
0
03 Jun 2021
APES: Audiovisual Person Search in Untrimmed Video
Juan Carlos León Alcázar
Long Mai
Federico Perazzi
Joon-Young Lee
Pablo Arbeláez
Guohao Li
Fabian Caba Heilbron
36
6
0
03 Jun 2021
TSI: Temporal Saliency Integration for Video Action Recognition
Haisheng Su
Kunchang Li
Jinyuan Feng
Dongliang Wang
Weihao Gan
Wei Wu
Yu Qiao
29
4
0
02 Jun 2021
Continual 3D Convolutional Neural Networks for Real-time Processing of Videos
Lukas Hedegaard
Alexandros Iosifidis
3DPC
25
14
0
31 May 2021
Multi-Modal Semantic Inconsistency Detection in Social Media News Posts
S. McCrae
Kehan Wang
A. Zakhor
36
15
0
26 May 2021
Anticipating human actions by correlating past with the future with Jaccard similarity measures
Basura Fernando
Samitha Herath
EgoV
30
57
0
26 May 2021
DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning
Wenhao Wu
Yuxiang Zhao
Yanwu Xu
Xiao Tan
Dongliang He
...
Jinxing Ye
Yingying Li
Mingde Yao
Zichao Dong
Yifeng Shi
AI4TS
30
27
0
25 May 2021
Sharing Pain: Using Pain Domain Transfer for Video Recognition of Low Grade Orthopedic Pain in Horses
Sofia Broomé
K. Ask
Maheen Rashid-Engström
Pia Haubro Andersen
Hedvig Kjellström
21
12
0
21 May 2021
See, Hear, Read: Leveraging Multimodality with Guided Attention for Abstractive Text Summarization
Yash Kumar Atri
Shraman Pramanick
Vikram Goyal
Tanmoy Chakraborty
42
33
0
20 May 2021
NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions
Junbin Xiao
Xindi Shang
Angela Yao
Tat-Seng Chua
45
448
0
18 May 2021
VPN++: Rethinking Video-Pose embeddings for understanding Activities of Daily Living
Srijan Das
Rui Dai
Di Yang
Francois Bremond
ViT
48
67
0
17 May 2021
Leveraging Semantic Scene Characteristics and Multi-Stream Convolutional Architectures in a Contextual Approach for Video-Based Visual Emotion Recognition in the Wild
Ioannis Pikoulis
P. Filntisis
Petros Maragos
34
14
0
16 May 2021
Cross-Modal Progressive Comprehension for Referring Segmentation
Si Liu
Tianrui Hui
Shaofei Huang
Yunchao Wei
Yue Liu
Guanbin Li
EgoV
VOS
28
124
0
15 May 2021
MutualNet: Adaptive ConvNet via Mutual Learning from Different Model Configurations
Taojiannan Yang
Sijie Zhu
Matías Mendieta
Pu Wang
Ravikumar Balakrishnan
Minwoo Lee
T. Han
M. Shah
Chong Chen
3DH
OOD
32
23
0
14 May 2021
REGINA - Reasoning Graph Convolutional Networks in Human Action Recognition
Bruno Degardin
Vasco Lopes
Hugo Proença
3DH
GNN
40
10
0
14 May 2021
Contrastive Learning of Image Representations with Cross-Video Cycle-Consistency
Haiping Wu
Xiaolong Wang
SSL
30
31
0
13 May 2021
WildGait: Learning Gait Representations from Raw Surveillance Streams
Adrian Cosma
I. Radoi
CVBM
37
16
0
12 May 2021
Stochastic Image-to-Video Synthesis using cINNs
Michael Dorkenwald
Timo Milbich
A. Blattmann
Robin Rombach
Konstantinos G. Derpanis
Bjorn Ommer
DiffM
VGen
21
54
0
10 May 2021
Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions
Mathew Monfort
SouYoung Jin
Alexander H. Liu
David Harwath
Rogerio Feris
James Glass
Aude Oliva
22
59
0
10 May 2021
Good Practices and A Strong Baseline for Traffic Anomaly Detection
Yuxiang Zhao
Wenhao Wu
Yue He
Yingying Li
Xiao Tan
Shifeng Chen
AI4TS
17
13
0
09 May 2021
Adaptive Focus for Efficient Video Recognition
Yulin Wang
Zhaoxi Chen
Haojun Jiang
Shiji Song
Yizeng Han
Gao Huang
45
98
0
07 May 2021
Unsupervised Visual Representation Learning by Tracking Patches in Video
Guangting Wang
Yizhou Zhou
Chong Luo
Wenxuan Xie
Wenjun Zeng
Zhiwei Xiong
SSL
37
24
0
06 May 2021
PLSM: A Parallelized Liquid State Machine for Unintentional Action Detection
Dipayan Das
Saumik Bhattacharya
Umapada Pal
S. Chanda
26
8
0
06 May 2021
GODIVA: Generating Open-DomaIn Videos from nAtural Descriptions
Chenfei Wu
Lun Huang
Qianxi Zhang
Binyang Li
Lei Ji
Fan Yang
Guillermo Sapiro
Nan Duan
DiffM
VGen
32
233
0
30 Apr 2021
A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning
Christoph Feichtenhofer
Haoqi Fan
Bo Xiong
Ross B. Girshick
Kaiming He
SSL
AI4TS
39
257
0
29 Apr 2021
3D Human Action Representation Learning via Cross-View Consistency Pursuit
Linguo Li
Minsi Wang
Bingbing Ni
Hang Wang
Jiancheng Yang
Wenjun Zhang
143
157
0
29 Apr 2021
Sign Segmentation with Changepoint-Modulated Pseudo-Labelling
Katrin Renz
N. Stache
Neil Fox
Gül Varol
Samuel Albanie
45
18
0
28 Apr 2021
Shot Contrastive Self-Supervised Learning for Scene Boundary Detection
Shixing Chen
Xiaohan Nie
David D. Fan
Dongqing Zhang
Vimal Bhat
Raffay Hamid
SSL
31
62
0
28 Apr 2021
FrameExit: Conditional Early Exiting for Efficient Video Recognition
Amir Ghodrati
B. Bejnordi
A. Habibian
45
81
0
27 Apr 2021
Joint Representation Learning and Novel Category Discovery on Single- and Multi-modal Data
Xu Jia
Kai Han
Yukun Zhu
Bradley Green
159
57
0
26 Apr 2021
Learning to Better Segment Objects from Unseen Classes with Unlabeled Videos
Yuming Du
Yanghua Xiao
Vincent Lepetit
33
8
0
25 Apr 2021
Supervised Video Summarization via Multiple Feature Sets with Parallel Attention
J. Ghauri
Sherzod Hakimov
Ralph Ewerth
21
45
0
23 Apr 2021
Low Pass Filter for Anti-aliasing in Temporal Action Localization
Cece Jin
Yuanqi Chen
Ge Li
Tao Zhang
Thomas H. Li
19
1
0
23 Apr 2021
Opening up Open-World Tracking
Yang Liu
Idil Esen Zulfikar
Jonathon Luiten
Achal Dave
Deva Ramanan
Bastian Leibe
Aljosa Osep
Laura Leal-Taixé
31
52
0
22 Apr 2021