1705.06950
The Kinetics Human Action Video Dataset
19 May 2017
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
Sudheendra Vijayanarasimhan
Fabio Viola
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman
Papers citing
"The Kinetics Human Action Video Dataset"
50 / 2,017 papers shown
Title
Fully Transformer-Equipped Architecture for End-to-End Referring Video Object Segmentation
P. Li
Yu Zhang
L. Yuan
Xianghua Xu
VOS
29
6
0
21 Sep 2023
AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models
Yuan Tseng
Layne Berry
Yi-Ting Chen
I-Hsiang Chiu
Hsuan-Hao Lin
...
Yu Tsao
Shinji Watanabe
Abdel-rahman Mohamed
Chi-Luen Feng
Hung-yi Lee
VLM
SSL
66
14
0
19 Sep 2023
Collaborative Three-Stream Transformers for Video Captioning
Hao Wang
Libo Zhang
Hengrui Fan
Tiejian Luo
41
6
0
18 Sep 2023
AV-MaskEnhancer: Enhancing Video Representations through Audio-Visual Masked Autoencoder
Xingjian Diao
Ming Cheng
Shitong Cheng
VGen
32
8
0
15 Sep 2023
Differentiable Resolution Compression and Alignment for Efficient Video Classification and Retrieval
Rui Deng
Qian Wu
Yuke Li
Haoran Fu
26
2
0
15 Sep 2023
Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning
Zhiwu Qing
Shiwei Zhang
Ziyuan Huang
Yingya Zhang
Changxin Gao
Deli Zhao
Nong Sang
37
18
0
14 Sep 2023
Generative Image Dynamics
Zhengqi Li
Richard Tucker
Noah Snavely
Aleksander Holynski
DiffM
48
63
0
14 Sep 2023
STUPD: A Synthetic Dataset for Spatial and Temporal Relation Reasoning
Palaash Agrawal
Haidi Azaman
Cheston Tan
56
3
0
13 Sep 2023
Enhancing multimodal cooperation via sample-level modality valuation
Yake Wei
Ruoxuan Feng
Zihe Wang
Di Hu
38
11
0
12 Sep 2023
JOADAA: joint online action detection and action anticipation
Mohammed Guermal
François Brémond
Rui Dai
Abid Ali
37
6
0
12 Sep 2023
Can we predict the Most Replayed data of video streaming platforms?
Alessandro Duico
Ombretta Strafforello
Jan van Gemert
24
1
0
12 Sep 2023
SCD-Net: Spatiotemporal Clues Disentanglement Network for Self-supervised Skeleton-based Action Recognition
Cong Wu
Xiaojun Wu
Josef Kittler
Tianyang Xu
Sara Atito
Muhammad Awais
Zhenhua Feng
43
3
0
11 Sep 2023
Multimodal Fish Feeding Intensity Assessment in Aquaculture
Meng Cui
Xubo Liu
Haohe Liu
Zhuangzhuang Du
Tao Chen
Guoping Lian
Daoliang Li
Wenwu Wang
34
5
0
10 Sep 2023
The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion
Yujin Jeong
Won-Wha Ryoo
Seunghyun Lee
Dabin Seo
Wonmin Byeon
Sangpil Kim
Jinkyu Kim
DiffM
32
29
0
08 Sep 2023
CDFSL-V: Cross-Domain Few-Shot Learning for Videos
Sarinda Samarasinghe
Mamshad Nayeem Rizve
Navid Kardan
M. Shah
27
11
0
07 Sep 2023
EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding
Yue Xu
Yong-Lu Li
Zhemin Huang
Michael Xu Liu
Cewu Lu
Yu-Wing Tai
Chi-Keung Tang
EgoV
33
9
0
05 Sep 2023
AAN: Attributes-Aware Network for Temporal Action Detection
Rui Dai
Srijan Das
Michael S. Ryoo
François Brémond
32
4
0
01 Sep 2023
Towards Contrastive Learning in Music Video Domain
Karel Veldkamp
Mariya Hendriksen
Zoltán Szlávik
Alexander Keijser
SSL
37
2
0
01 Sep 2023
RePo: Resilient Model-Based Reinforcement Learning by Regularizing Posterior Predictability
Chuning Zhu
Max Simchowitz
Siri Gadipudi
Abhishek Gupta
46
13
0
31 Aug 2023
CEFHRI: A Communication Efficient Federated Learning Framework for Recognizing Industrial Human-Robot Interaction
Umar Khalid
Hasan Iqbal
Saeed Vahidian
Jing Hua
Chong Chen
21
3
0
29 Aug 2023
Evaluation of Key Spatiotemporal Learners for Print Track Anomaly Classification Using Melt Pool Image Streams
Lynn Cherif
Mutahar Safdar
Guy Lamouche
P. Wanjara
P. Paul
G. Wood
Max Zimmermann
F. Hannesen
Yao Zhao
36
1
0
28 Aug 2023
Learning to Read Analog Gauges from Synthetic Data
Juan Carlos León Alcázar
Yazeed Alnumay
Cheng Zheng
Hassane Trigui
Sahejad Patel
Guohao Li
11
3
0
28 Aug 2023
Improving Video Violence Recognition with Human Interaction Learning on 3D Skeleton Point Clouds
Yukun Su
Guosheng Lin
Qingyao Wu
3DH
3DPC
29
3
0
26 Aug 2023
Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers
Matthew Dutson
Yin Li
M. Gupta
ViT
45
8
0
25 Aug 2023
Motion-Guided Masking for Spatiotemporal Representation Learning
D. Fan
Jue Wang
Shuai Liao
Yi Zhu
Vimal Bhat
H. Santos-Villalobos
M. Rohith
Xinyu Li
VGen
37
19
0
24 Aug 2023
An All Deep System for Badminton Game Analysis
Po-Yung Chou
Yu-Chun Lo
Bo Xie
Chu-Hsing Lin
Yu-Yung Kao
17
0
0
24 Aug 2023
NPF-200: A Multi-Modal Eye Fixation Dataset and Method for Non-Photorealistic Videos
Ziyuan Yang
Sucheng Ren
Zongwei Wu
Nanxuan Zhao
Junle Wang
Jing Qin
Shengfeng He
41
2
0
23 Aug 2023
Towards Privacy-Supporting Fall Detection via Deep Unsupervised RGB2Depth Adaptation
Hejun Xiao
Kunyu Peng
Xiangsheng Huang
Alina Roitberg
Hao Li
Zhao Wang
Rainer Stiefelhagen
28
3
0
23 Aug 2023
StoryBench: A Multifaceted Benchmark for Continuous Story Visualization
Emanuele Bugliarello
Hernan Moraldo
Ruben Villegas
Mohammad Babaeizadeh
M. Saffar
Han Zhang
D. Erhan
V. Ferrari
Pieter-Jan Kindermans
P. Voigtlaender
VGen
41
10
0
22 Aug 2023
Learning from Semantic Alignment between Unpaired Multiviews for Egocentric Video Recognition
Qitong Wang
Long Zhao
Liangzhe Yuan
Ting Liu
Xi Peng
36
12
0
22 Aug 2023
Audio-Visual Class-Incremental Learning
Weiguo Pian
Shentong Mo
Yunhui Guo
Yapeng Tian
CLL
VLM
33
28
0
21 Aug 2023
TeD-SPAD: Temporal Distinctiveness for Self-supervised Privacy-preservation for video Anomaly Detection
Joe Fioresi
I. Dave
M. Shah
43
18
0
21 Aug 2023
Improving Continuous Sign Language Recognition with Cross-Lingual Signs
Fangyun Wei
Yutong Chen
SLR
33
28
0
21 Aug 2023
MGMAE: Motion Guided Masking for Video Masked Autoencoding
Bingkun Huang
Zhiyu Zhao
Guozhen Zhang
Yu Qiao
Limin Wang
44
30
0
21 Aug 2023
Self-Feedback DETR for Temporal Action Detection
Jihwan Kim
Miso Lee
Jae-Pil Heo
53
18
0
21 Aug 2023
Joint learning of images and videos with a single Vision Transformer
Shuki Shimizu
Toru Tamaki
ViT
24
0
0
21 Aug 2023
Learnt Contrastive Concept Embeddings for Sign Recognition
Ryan Wong
Necati Cihan Camgöz
Richard Bowden
29
5
0
18 Aug 2023
Audio-Visual Glance Network for Efficient Video Recognition
Muhammad Adi Nugroho
Sangmin Woo
Sumin Lee
Changick Kim
24
5
0
18 Aug 2023
The Unreasonable Effectiveness of Large Language-Vision Models for Source-free Video Domain Adaptation
Giacomo Zara
Alessandro Conti
Subhankar Roy
Stéphane Lathuilière
Paolo Rota
Elisa Ricci
33
11
0
17 Aug 2023
Memory-and-Anticipation Transformer for Online Action Understanding
Jiahao Wang
Guo Chen
Yifei Huang
Liming Wang
Tong Lu
OffRL
62
37
0
15 Aug 2023
Boosting Multi-modal Model Performance with Adaptive Gradient Modulation
Hong Li
Xingyu Li
Pengbo Hu
Yinuo Lei
Chunxiao Li
Yi Zhou
49
22
0
15 Aug 2023
Temporally-Adaptive Models for Efficient Video Understanding
Ziyuan Huang
Shiwei Zhang
Liang Pan
Zhiwu Qing
Yingya Zhang
Ziwei Liu
Marcelo H. Ang
46
9
0
10 Aug 2023
PAT: Position-Aware Transformer for Dense Multi-Label Action Detection
Faegheh Sardari
A. Mustafa
Philip J. B. Jackson
A. Hilton
ViT
27
6
0
09 Aug 2023
JEDI: Joint Expert Distillation in a Semi-Supervised Multi-Dataset Student-Teacher Scenario for Video Action Recognition
L. Bicsi
B. Alexe
Radu Tudor Ionescu
Marius Leordeanu
22
2
0
09 Aug 2023
Temporal DINO: A Self-supervised Video Strategy to Enhance Action Prediction
Izzeddin Teeti
Rongali Sai Bhargav
Vivek Singh
Andrew Bradley
Biplab Banerjee
Fabio Cuzzolin
19
1
0
08 Aug 2023
Prune Spatio-temporal Tokens by Semantic-aware Temporal Accumulation
Shuangrui Ding
Peisen Zhao
Xiaopeng Zhang
Rui Qian
H. Xiong
Qi Tian
ViT
29
16
0
08 Aug 2023
A Survey on Deep Learning-based Spatio-temporal Action Detection
Peng Wang
Fanwei Zeng
Yu Qian
34
5
0
03 Aug 2023
TS-RGBD Dataset: a Novel Dataset for Theatre Scenes Description for People with Visual Impairments
Leyla Benhamida
Khadidja Delloul
S. Larabi
16
1
0
02 Aug 2023
Ada-DQA: Adaptive Diverse Quality-aware Feature Acquisition for Video Quality Assessment
Hongbo Liu
Ming-Kun Wu
Kun Yuan
Ming-Ting Sun
Yansong Tang
Chuanchuan Zheng
Xingsen Wen
Xiu Li
47
17
0
01 Aug 2023
Capturing Co-existing Distortions in User-Generated Content for No-reference Video Quality Assessment
Kun Yuan
Zishang Kong
Chuanchuan Zheng
Ming-Ting Sun
Xingsen Wen
ViT
32
14
0
31 Jul 2023