1705.06950
The Kinetics Human Action Video Dataset
19 May 2017
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
Sudheendra Vijayanarasimhan
Fabio Viola
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman
Papers citing
"The Kinetics Human Action Video Dataset"
50 / 2,015 papers shown
TAVGBench: Benchmarking Text to Audible-Video Generation
Yuxin Mao
Xuyang Shen
Jing Zhang
Zhen Qin
Jinxing Zhou
Mochu Xiang
Yiran Zhong
Yuchao Dai
48
11
0
22 Apr 2024
STAT: Towards Generalizable Temporal Action Localization
Yangcen Liu
Ziyi Liu
Yuanhao Zhai
Wen Li
David Doermann
Junsong Yuan
36
2
0
20 Apr 2024
PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation
Tianyuan Zhang
Hong-Xing Yu
Rundi Wu
Brandon Yushan Feng
Changxi Zheng
Noah Snavely
Jiajun Wu
William T. Freeman
AI4CE
VGen
82
62
0
19 Apr 2024
Aligning Actions and Walking to LLM-Generated Textual Descriptions
Radu Chivereanu
Adrian Cosma
Andy Catruna
R. Rughinis
I. Radoi
57
2
0
18 Apr 2024
Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition
Xunsong Li
Pengzhan Sun
Yangcen Liu
Lixin Duan
Wen Li
43
3
0
18 Apr 2024
Multi-Task Multi-Modal Self-Supervised Learning for Facial Expression Recognition
Marah Halawa
Florian Blume
Pia Bideau
Martin Maier
Rasha Abdel Rahman
Olaf Hellwich
CVBM
36
1
0
16 Apr 2024
EgoPet: Egomotion and Interaction Data from an Animal's Perspective
Amir Bar
Arya Bakhtiar
Danny Tran
Antonio Loquercio
Jathushan Rajasegaran
Yann LeCun
Amir Globerson
Trevor Darrell
EgoV
41
4
0
15 Apr 2024
Design and Analysis of Efficient Attention in Transformers for Social Group Activity Recognition
Masato Tamura
29
2
0
15 Apr 2024
Learning Tracking Representations from Single Point Annotations
Qiangqiang Wu
Antoni B. Chan
33
1
0
15 Apr 2024
Leveraging Temporal Contextualization for Video Action Recognition
Minji Kim
Dongyoon Han
Taekyung Kim
Bohyung Han
51
2
0
15 Apr 2024
Task-Driven Exploration: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection
Jin Yang
Ping Wei
Huan Li
Ziyang Ren
51
8
0
14 Apr 2024
Exploring Explainability in Video Action Recognition
Avinab Saha
Shashank Gupta
S. Ankireddy
Karl Chahine
Joydeep Ghosh
32
0
0
13 Apr 2024
ChimpVLM: Ethogram-Enhanced Chimpanzee Behaviour Recognition
Otto Brookes
Majid Mirmehdi
H. Kühl
T. Burghardt
35
3
0
13 Apr 2024
Multimodal Attack Detection for Action Recognition Models
Furkan Mumcu
Yasin Yılmaz
AAML
33
1
0
13 Apr 2024
ActNetFormer: Transformer-ResNet Hybrid Method for Semi-Supervised Action Recognition in Videos
Sharana Dharshikgan Suresh Dass
H. Barua
Ganesh Krishnasamy
Raveendran Paramesran
Raphael C.-W. Phan
ViT
45
2
0
09 Apr 2024
TIM: A Time Interval Machine for Audio-Visual Action Recognition
Jacob Chalk
Jaesung Huh
Evangelos Kazakos
Andrew Zisserman
Dima Damen
46
9
0
08 Apr 2024
UniMD: Towards Unifying Moment Retrieval and Temporal Action Detection
Yingsen Zeng
Yujie Zhong
Chengjian Feng
Lin Ma
63
7
0
07 Apr 2024
Study of the effect of Sharpness on Blind Video Quality Assessment
Anantha Prabhu
David Pratap
Narayana Darapeni
R. AnweshP
28
0
0
06 Apr 2024
SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos
Tao Wu
Runyu He
Gangshan Wu
Limin Wang
3DH
54
3
0
06 Apr 2024
Learning Correlation Structures for Vision Transformers
Manjin Kim
Paul Hongsuck Seo
Cordelia Schmid
Minsu Cho
ViT
40
7
0
05 Apr 2024
SalFoM: Dynamic Saliency Prediction with Video Foundation Models
Morteza Moradi
Mohammad Moradi
Francesco Rundo
C. Spampinato
Ali Borji
S. Palazzo
44
1
0
03 Apr 2024
SnAG: Scalable and Accurate Video Grounding
Fangzhou Mu
Sicheng Mo
Yin Li
42
8
0
02 Apr 2024
Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model
Xu He
Qiaochu Huang
Zhensong Zhang
Zhiwei Lin
Zhiyong Wu
Sicheng Yang
Minglei Li
Zhiyi Chen
Songcen Xu
Xiaofei Wu
35
15
0
02 Apr 2024
360+x: A Panoptic Multi-modal Scene Understanding Dataset
Hao Chen
Yuqi Hou
Chenyuan Qu
Irene Testini
Xiaohan Hong
Jianbo Jiao
31
7
0
01 Apr 2024
ST-LLM: Large Language Models Are Effective Temporal Learners
Ruyang Liu
Chen Li
Haoran Tang
Yixiao Ge
Ying Shan
Ge Li
48
70
0
30 Mar 2024
X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization
Anna Kukleva
Fadime Sener
Edoardo Remelli
Bugra Tekin
Eric Sauser
Bernt Schiele
Shugao Ma
VLM
EgoV
45
1
0
28 Mar 2024
Frame by Familiar Frame: Understanding Replication in Video Diffusion Models
Aimon Rahman
Malsha V. Perera
Vishal M. Patel
VGen
51
7
0
28 Mar 2024
PLOT-TAL -- Prompt Learning with Optimal Transport for Few-Shot Temporal Action Localization
Edward Fish
Jon Weinbren
Andrew Gilbert
49
1
0
27 Mar 2024
OmniVid: A Generative Framework for Universal Video Understanding
Junke Wang
Dongdong Chen
Chong Luo
Bo He
Lu Yuan
Zuxuan Wu
Yu-Gang Jiang
VLM
VGen
77
14
0
26 Mar 2024
Efficient Image Pre-Training with Siamese Cropped Masked Autoencoders
Alexandre Eymaël
Renaud Vandeghen
A. Cioppa
Silvio Giancola
Guohao Li
Marc Van Droogenbroeck
ViT
48
6
0
26 Mar 2024
PaPr: Training-Free One-Step Patch Pruning with Lightweight ConvNets for Faster Inference
Tanvir Mahmud
Burhaneddin Yaman
Chun-Hao Liu
Diana Marculescu
38
2
0
24 Mar 2024
Edit3K: Universal Representation Learning for Video Editing Components
Xin Gu
Libo Zhang
Fan Chen
Longyin Wen
Yufei Wang
Tiejian Luo
Sijie Zhu
43
4
0
24 Mar 2024
Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey
Zeyu Han
Chao Gao
Jinyang Liu
Jeff Zhang
Sai Qian Zhang
150
318
0
21 Mar 2024
MULDE: Multiscale Log-Density Estimation via Denoising Score Matching for Video Anomaly Detection
Jakub Micorek
Horst Possegger
Dominik Narnhofer
Horst Bischof
Mateusz Koziński
18
6
0
21 Mar 2024
Selective, Interpretable, and Motion Consistent Privacy Attribute Obfuscation for Action Recognition
Filip Ilic
Henghui Zhao
Thomas Pock
Richard P. Wildes
PICV
AAML
44
2
0
19 Mar 2024
Dynamic Spatial-Temporal Aggregation for Skeleton-Aware Sign Language Recognition
Lianyu Hu
Liqing Gao
Zekang Liu
Wei Feng
SLR
40
1
0
19 Mar 2024
VideoBadminton: A Video Dataset for Badminton Action Recognition
Qi Li
Tzu-Chen Chiu
Hsiang-Wei Huang
Minmin Sun
Wei-Shinn Ku
34
3
0
19 Mar 2024
CrossGLG: LLM Guides One-shot Skeleton-based 3D Action Recognition in a Cross-level Manner
Tingbing Yan
Wenzheng Zeng
Yang Xiao
Xingyu Tong
Bo Tan
Zhiwen Fang
Zhiguo Cao
Qiufeng Wang
36
5
0
15 Mar 2024
AD3: Implicit Action is the Key for World Models to Distinguish the Diverse Visual Distractors
Yucen Wang
Shenghua Wan
Le Gan
Shuai Feng
De-Chuan Zhan
VGen
27
4
0
15 Mar 2024
Generalized Predictive Model for Autonomous Driving
Jiazhi Yang
Shenyuan Gao
Yihang Qiu
Li Chen
Tianyu Li
...
Ping Luo
Jun Zhang
Andreas Geiger
Yu Qiao
Hongyang Li
VGen
73
57
0
14 Mar 2024
Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding
Guo Chen
Yifei Huang
Jilan Xu
Baoqi Pei
Zhe Chen
Zhiqi Li
Jiahao Wang
Kunchang Li
Tong Lu
Limin Wang
Mamba
64
73
0
14 Mar 2024
SkateFormer: Skeletal-Temporal Transformer for Human Action Recognition
Jeonghyeok Do
Munchurl Kim
ViT
43
18
0
14 Mar 2024
Don't Judge by the Look: Towards Motion Coherent Video Representation
Yitian Zhang
Yue Bai
Huan Wang
Yizhou Wang
Yun Fu
35
0
0
14 Mar 2024
EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning
Jongsuk Kim
Hyeongkeun Lee
Kyeongha Rho
Junmo Kim
Joon Son Chung
34
4
0
14 Mar 2024
Spatiotemporal Representation Learning for Short and Long Medical Image Time Series
Chengzhi Shen
M. Menten
Hrvoje Bogunović
U. Schmidt-Erfurth
H. Scholl
S. Sivaprasad
A. Lotery
Daniel Rueckert
Paul Hager
Robbie Holland
40
2
0
12 Mar 2024
VideoMamba: State Space Model for Efficient Video Understanding
Kunchang Li
Xinhao Li
Yi Wang
Yinan He
Yali Wang
Limin Wang
Yu Qiao
Mamba
37
182
0
11 Mar 2024
Density-Guided Label Smoothing for Temporal Localization of Driving Actions
Tunç Alkanat
Erkut Akdag
Egor Bondarev
Peter H. N. de With
38
4
0
11 Mar 2024
Transformer-based Fusion of 2D-pose and Spatio-temporal Embeddings for Distracted Driver Action Recognition
Erkut Akdag
Zeqi Zhu
Egor Bondarev
Peter H. N. de With
ViT
37
5
0
11 Mar 2024
Benchmarking Micro-action Recognition: Dataset, Methods, and Applications
Dan Guo
Kun Li
Bin Hu
Yan Zhang
Meng Wang
62
38
0
08 Mar 2024
A spatiotemporal style transfer algorithm for dynamic visual stimulus generation
Antonino Greco
Markus Siegel
25
2
0
07 Mar 2024