1705.06950
The Kinetics Human Action Video Dataset
19 May 2017
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
Sudheendra Vijayanarasimhan
Fabio Viola
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman
Papers citing "The Kinetics Human Action Video Dataset"
50 / 2,016 papers shown
Title
A spatiotemporal style transfer algorithm for dynamic visual stimulus generation
Antonino Greco
Markus Siegel
30
2
0
07 Mar 2024
Embodied Understanding of Driving Scenarios
Yunsong Zhou
Linyan Huang
Qingwen Bu
Jia Zeng
Tianyu Li
Hang Qiu
Hongzi Zhu
Minyi Guo
Yu Qiao
Hongyang Li
LM&Ro
62
31
0
07 Mar 2024
Fast Low-parameter Video Activity Localization in Collaborative Learning Environments
Venkatesh Jatla
Sravani Teeparthi
Ugesh Egala
Sylvia Celedón-Pattichis
Marios S. Pattichis
29
2
0
02 Mar 2024
VideoMAC: Video Masked Autoencoders Meet ConvNets
Gensheng Pei
Tao Chen
XiRuo Jiang
Huafeng Liu
Zeren Sun
Yazhou Yao
VGen
52
9
0
29 Feb 2024
Multimodal Transformer With a Low-Computational-Cost Guarantee
Sungjin Park
Edward Choi
52
1
0
23 Feb 2024
VideoPrism: A Foundational Visual Encoder for Video Understanding
Long Zhao
N. B. Gundavarapu
Liangzhe Yuan
Hao Zhou
Shen Yan
...
Huisheng Wang
Hartwig Adam
Mikhail Sirotenko
Ting Liu
Boqing Gong
VGen
50
29
0
20 Feb 2024
Revisiting Feature Prediction for Learning Visual Representations from Video
Adrien Bardes
Q. Garrido
Jean Ponce
Xinlei Chen
Michael G. Rabbat
Yann LeCun
Mahmoud Assran
Nicolas Ballas
MDE
VLM
95
75
0
15 Feb 2024
What's in the Flow? Exploiting Temporal Motion Cues for Unsupervised Generic Event Boundary Detection
Sourabh Vasant Gothe
Vibhav Agarwal
Sourav Ghosh
Jayesh Rajkumar Vachhani
Pranay Kashyap
Barath Raj Kandur
33
2
0
15 Feb 2024
Lester: rotoscope animation through video object segmentation and tracking
Ruben Tous
DiffM
VOS
41
0
0
15 Feb 2024
Towards Privacy-Aware Sign Language Translation at Scale
Phillip Rust
Bowen Shi
Skyler Wang
Necati Cihan Camgöz
Jean Maillard
SLR
47
14
0
14 Feb 2024
Advancing Human Action Recognition with Foundation Models trained on Unlabeled Public Videos
Yang Qian
Yinan Sun
A. Kargarandehkordi
Parnian Azizian
O. Mutlu
Saimourya Surabhi
Pingyi Chen
Zain Jabbar
Dennis Paul Wall
Peter Washington
OffRL
29
1
0
14 Feb 2024
Rolling Diffusion Models
David Ruhe
Jonathan Heek
Tim Salimans
Emiel Hoogeboom
DiffM
40
34
0
12 Feb 2024
BDIQA: A New Dataset for Video Question Answering to Explore Cognitive Reasoning through Theory of Mind
Yuanyuan Mao
Xin Lin
Qin Ni
Liang He
29
3
0
12 Feb 2024
Quantifying and Enhancing Multi-modal Robustness with Modality Preference
Zequn Yang
Yake Wei
Ce Liang
Di Hu
AAML
32
9
0
09 Feb 2024
Point-VOS: Pointing Up Video Object Segmentation
Idil Esen Zulfikar
Sabarinath Mahadevan
P. Voigtlaender
Bastian Leibe
VOS
27
2
0
08 Feb 2024
Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data
Shufan Li
Harkanwar Singh
Aditya Grover
Mamba
95
57
0
08 Feb 2024
Meet JEANIE: a Similarity Measure for 3D Skeleton Sequences via Temporal-Viewpoint Alignment
Lei Wang
Jun Liu
Liang Zheng
Tom Gedeon
Piotr Koniusz
37
9
0
07 Feb 2024
Boosting Adversarial Transferability across Model Genus by Deformation-Constrained Warping
Qinliang Lin
Cheng Luo
Zenghao Niu
Xilin He
Weicheng Xie
Yuanbo Hou
Linlin Shen
Siyang Song
AAML
47
13
0
06 Feb 2024
VLN-Video: Utilizing Driving Videos for Outdoor Vision-and-Language Navigation
Jialu Li
Aishwarya Padmakumar
Gaurav Sukhatme
Mohit Bansal
29
6
0
05 Feb 2024
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization
Yang Jin
Zhicheng Sun
Kun Xu
Kun Xu
Liwei Chen
...
Yuliang Liu
Di Zhang
Yang Song
Kun Gai
Yadong Mu
VGen
55
42
0
05 Feb 2024
Taylor Videos for Action Recognition
Lei Wang
Xiuyuan Yuan
Tom Gedeon
Liang Zheng
26
6
0
05 Feb 2024
Time-, Memory- and Parameter-Efficient Visual Adaptation
Otniel-Bogdan Mercea
Alexey Gritsenko
Cordelia Schmid
Anurag Arnab
VLM
40
13
0
05 Feb 2024
Classification of Tennis Actions Using Deep Learning
Emil Hovad
Therese Hougaard-Jensen
L. H. Clemmensen
24
5
0
04 Feb 2024
Region-Based Representations Revisited
Michal Shlapentokh-Rothman
Ansel Blume
Yao Xiao
Yuqun Wu
TV Sethuraman
Heyi Tao
Jae Yong Lee
Wilfredo Torres
Yu-xiong Wang
Derek Hoiem
42
5
0
04 Feb 2024
Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey
Yi Xin
Jianjiang Yang
Haodi Zhou
Junlong Du
Yue Fan
Qing Li
Yuntao Du
VLM
75
77
0
03 Feb 2024
NeuroCine: Decoding Vivid Video Sequences from Human Brain Activties
Jingyuan Sun
Mingxiao Li
Zijiao Chen
Marie-Francine Moens
VGen
36
7
0
02 Feb 2024
A Survey on Generative AI and LLM for Video Generation, Understanding, and Streaming
Pengyuan Zhou
Lin Wang
Zhi Liu
Yanbin Hao
Pan Hui
Sasu Tarkoma
J. Kangasharju
VGen
48
26
0
30 Jan 2024
Computer Vision for Primate Behavior Analysis in the Wild
Richard Vogg
Timo Lüddecke
Jonathan Henrich
Sharmita Dey
Matthias Nuske
...
Alexander Gail
Stefan Treue
H. Scherberger
F. Worgotter
Alexander S. Ecker
43
3
0
29 Jan 2024
MV2MAE: Multi-View Video Masked Autoencoders
Ketul Shah
Robert Crandall
Jie Xu
Peng Zhou
Marian George
Mayank Bansal
Rama Chellappa
38
4
0
29 Jan 2024
Multi-model learning by sequential reading of untrimmed videos for action recognition
Kodai Kamiya
Toru Tamaki
36
0
0
26 Jan 2024
Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities
Yiyuan Zhang
Xiaohan Ding
Kaixiong Gong
Yixiao Ge
Ying Shan
Xiangyu Yue
ViT
22
7
0
25 Jan 2024
PanAf20K: A Large Video Dataset for Wild Ape Detection and Behaviour Recognition
Otto Brookes
Majid Mirmehdi
Colleen Stephens
Samuel Angedakin
Katherine Corogenes
...
Klaus Zuberbühler
Christophe Boesch
M. Arandjelovic
H. Kühl
T. Burghardt
35
14
0
24 Jan 2024
Interleaving One-Class and Weakly-Supervised Models with Adaptive Thresholding for Unsupervised Video Anomaly Detection
Yongwei Nie
Hao Huang
Chengjiang Long
Qing Zhang
Pradipta Maji
Hongmin Cai
36
1
0
24 Jan 2024
Deep Learning for Computer Vision based Activity Recognition and Fall Detection of the Elderly: a Systematic Review
F. X. Gaya-Morey
Cristina Manresa-Yee
Jose Maria Buades Rubio
31
12
0
22 Jan 2024
ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition
Jiaming Zhou
Junwei Liang
Kun-Yu Lin
Jinrui Yang
Wei-Shi Zheng
VLM
21
8
0
22 Jan 2024
M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition
Mengmeng Wang
Jiazheng Xing
Boyuan Jiang
Jun Chen
Jianbiao Mei
Xingxing Zuo
Guang Dai
Jingdong Wang
Yong-Jin Liu
VLM
28
4
0
22 Jan 2024
Detecting Multimedia Generated by Large AI Models: A Survey
Li Lin
Neeraj Gupta
Yue Zhang
Hainan Ren
Chun-Hao Liu
Feng Ding
Xin Wang
Xin Li
Luisa Verdoliva
Shu Hu
88
58
0
22 Jan 2024
Exploring Missing Modality in Multimodal Egocentric Datasets
Merey Ramazanova
Alejandro Pardo
Humam Alwassel
Guohao Li
EgoV
43
4
0
21 Jan 2024
Adversarial Augmentation Training Makes Action Recognition Models More Robust to Realistic Video Distribution Shifts
Kiyoon Kim
Shreyank N. Gowda
Panagiotis Eustratiadis
Antreas Antoniou
Robert B Fisher
45
2
0
21 Jan 2024
Deep Reinforcement Learning Empowered Activity-Aware Dynamic Health Monitoring Systems
Ziqiang Ye
Yulan Gao
Yue Xiao
Zehui Xiong
Dusit Niyato
18
2
0
19 Jan 2024
GPT4Ego: Unleashing the Potential of Pre-trained Models for Zero-Shot Egocentric Action Recognition
Guangzhao Dai
Xiangbo Shu
Wenhao Wu
Rui Yan
Jiachao Zhang
VLM
32
5
0
18 Jan 2024
Depth Over RGB: Automatic Evaluation of Open Surgery Skills Using Depth Camera
Ido Zuckerman
Nicole Werner
Jonathan Kouchly
Emma Huston
Shannon DiMarco
Paul D Dimusto
S. Laufer
34
2
0
18 Jan 2024
From Coarse to Fine: Efficient Training for Audio Spectrogram Transformers
Jiu Feng
Mehmet Hamza Erol
Joon Son Chung
Arda Senocak
29
1
0
16 Jan 2024
Transformer-based Video Saliency Prediction with High Temporal Dimension Decoding
Morteza Moradi
S. Palazzo
C. Spampinato
34
2
0
15 Jan 2024
FiGCLIP: Fine-Grained CLIP Adaptation via Densely Annotated Videos
S. DarshanSingh
Zeeshan Khan
Makarand Tapaswi
VLM
CLIP
36
3
0
15 Jan 2024
Collaboratively Self-supervised Video Representation Learning for Action Recognition
Jie Zhang
Zhifan Wan
Lanqing Hu
Stephen Lin
Shuzhe Wu
Shiguang Shan
TTA
67
1
0
15 Jan 2024
Hierarchical Augmentation and Distillation for Class Incremental Audio-Visual Video Recognition
Yukun Zuo
Hantao Yao
Liansheng Zhuang
Changsheng Xu
15
2
0
11 Jan 2024
HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition
Qian Wu
Ruoxuan Cui
Yuke Li
Haoqi Zhu
ViT
32
2
0
10 Jan 2024
Dr²Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning
Chen Zhao
Shuming Liu
K. Mangalam
Guocheng Qian
Fatimah Zohra
Abdulmohsen Alghannam
Jitendra Malik
Guohao Li
54
3
0
08 Jan 2024
Efficient Multiscale Multimodal Bottleneck Transformer for Audio-Video Classification
Wentao Zhu
41
5
0
08 Jan 2024