The Kinetics Human Action Video Dataset

19 May 2017
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
Sudheendra Vijayanarasimhan
Fabio Viola
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman

Papers citing "The Kinetics Human Action Video Dataset"

50 / 2,015 papers shown
Reinforcement Learning from Wild Animal Videos
Elliot Chane-Sane
Constant Roux
O. Stasse
Nicolas Mansard
217
0
0
05 Dec 2024
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?
Kaixiong Gong
Kaituo Feng
Yangqiu Song
Yibing Wang
Mofan Cheng
...
Jiaming Han
Benyou Wang
Yutong Bai
Zhiyong Yang
Xiangyu Yue
MLLM
AuLLM
VLM
91
5
0
03 Dec 2024
VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval
Dhiman Paul
Md Rizwan Parvez
Nabeel Mohammed
Shafin Rahman
VGen
77
0
0
02 Dec 2024
SEAL: Semantic Attention Learning for Long Video Representation
Lan Wang
Yujia Chen
Wen-Sheng Chu
Vishnu Naresh Boddeti
Du Tran
VLM
75
0
0
02 Dec 2024
KDC-MAE: Knowledge Distilled Contrastive Mask Auto-Encoder
Maheswar Bora
Saurabh Atreya
Aritra Mukherjee
Abhijit Das
92
0
0
19 Nov 2024
LaVin-DiT: Large Vision Diffusion Transformer
Zhaoqing Wang
Xiaobo Xia
Runnan Chen
Dongdong Yu
Changhu Wang
Mingming Gong
Tongliang Liu
94
6
0
18 Nov 2024
TDSM: Triplet Diffusion for Skeleton-Text Matching in Zero-Shot Action Recognition
Jeonghyeok Do
Munchurl Kim
49
1
0
16 Nov 2024
DiMoDif: Discourse Modality-information Differentiation for Audio-visual Deepfake Detection and Localization
C. Koutlis
Symeon Papadopoulos
58
2
0
15 Nov 2024
A Transformer-Based Visual Piano Transcription Algorithm
Uros Zivanovic
Carlos Eduardo Cancino-Chacón
ViT
31
0
0
13 Nov 2024
Weakly-Supervised Anomaly Detection in Surveillance Videos Based on Two-Stream I3D Convolution Network
Sareh Nejad
Anwar Haque
21
1
0
13 Nov 2024
Multimodal Fusion Balancing Through Game-Theoretic Regularization
Konstantinos Kontras
Thomas Strypsteen
Christos Chatzichristos
Paul P. Liang
Matthew Blaschko
M. D. Vos
36
0
0
11 Nov 2024
Multi-Modal interpretable automatic video captioning
Antoine Hanna-Asaad
Decky Aspandi
Titus Zaharia
33
0
0
11 Nov 2024
Extended multi-stream temporal-attention module for skeleton-based human action recognition (HAR)
Faisal Mehmood
Xin Guo
Enqing Chen
Muhammad Azeem Akbar
A. Khan
Sami Ullah
28
4
0
10 Nov 2024
Improved Video VAE for Latent Video Diffusion Model
Pingyu Wu
Kai Zhu
Yu Liu
Liming Zhao
Wei-dong Zhai
Yang Cao
Zheng-jun Zha
VGen
DiffM
61
4
0
10 Nov 2024
CityGuessr: City-Level Video Geo-Localization on a Global Scale
P. Kulkarni
Gaurav Kumar Nayak
Mubarak Shah
ViT
AI4TS
29
2
0
10 Nov 2024
Don't Look Twice: Faster Video Transformers with Run-Length Tokenization
Rohan Choudhury
Guanglei Zhu
Sihan Liu
Koichiro Niinuma
Kris M. Kitani
László A. Jeni
31
11
0
07 Nov 2024
HourVideo: 1-Hour Video-Language Understanding
Keshigeyan Chandrasegaran
Agrim Gupta
Lea M. Hadzic
Taran Kota
Jimming He
Cristobal Eyzaguirre
Zane Durante
Manling Li
Jiajun Wu
L. Fei-Fei
VLM
53
34
0
07 Nov 2024
PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance
Ruyang Liu
Haoran Tang
Haibo Liu
Yixiao Ge
Ying Shan
Chen Li
Jiankun Yang
VLM
53
6
0
04 Nov 2024
SPECTRUM: Semantic Processing and Emotion-informed video-Captioning Through Retrieval and Understanding Modalities
Ehsan Faghihi
Mohammedreza Zarenejad
Ali-Asghar Beheshti Shirazi
47
0
0
04 Nov 2024
Constrained Human-AI Cooperation: An Inclusive Embodied Social Intelligence Challenge
Weihua Du
Qiushi Lyu
Jiaming Shan
Zhenting Qi
Hongxin Zhang
...
Andi Peng
Tianmin Shu
Kwonjoon Lee
Behzad Dariush
Chuang Gan
42
1
0
04 Nov 2024
ROAD-Waymo: Action Awareness at Scale for Autonomous Driving
Salman Khan
Izzeddin Teeti
Reza Javanmard Alitappeh
Mihaela C. Stoian
Eleonora Giunchiglia
Gurkirt Singh
Andrew Bradley
Fabio Cuzzolin
47
0
0
03 Nov 2024
STAA: Spatio-Temporal Attention Attribution for Real-Time Interpreting Transformer-based Video Models
Zerui Wang
Yan Liu
55
0
0
01 Nov 2024
Learning Video Representations without Natural Videos
Xueyang Yu
Xinlei Chen
Yossi Gandelsman
VGen
AI4TS
54
0
0
31 Oct 2024
Video Token Merging for Long-form Video Understanding
Seon-Ho Lee
Jue Wang
Zhikang Zhang
D. Fan
Xinyu Li
48
5
0
31 Oct 2024
Deep Convolutional Neural Networks on Multiclass Classification of Three-Dimensional Brain Images for Parkinson's Disease Stage Prediction
Guan-Hua Huang
Wan-Chen Lai
Tai-Been Chen
Chien-Chin Hsu
Huei-Yung Chen
Yi-Chen Wu
Li-Ren Yeh
MedIm
39
2
0
31 Oct 2024
EchoFM: Foundation Model for Generalizable Echocardiogram Analysis
Sekeun Kim
Pengfei Jin
S. Song
Cheng Chen
Yiwei Li
Hui Ren
Xiang Li
Tianming Liu
Quanzheng Li
39
0
0
30 Oct 2024
AtGCN: A Graph Convolutional Network For Ataxic Gait Detection
Karan Bania
Tanmay Verlekar
31
1
0
30 Oct 2024
Multi-Level Feature Distillation of Joint Teachers Trained on Distinct Image Datasets
Adrian Iordache
B. Alexe
Radu Tudor Ionescu
31
1
0
29 Oct 2024
Analytic Continual Test-Time Adaptation for Multi-Modality Corruption
Yufei Zhang
Yicheng Xu
Hongxin Wei
Zhiping Lin
Huiping Zhuang
TTA
37
0
0
29 Oct 2024
USpeech: Ultrasound-Enhanced Speech with Minimal Human Effort via Cross-Modal Synthesis
Luca Jiang-Tao Yu
Running Zhao
Sijie Ji
Edith C.H. Ngai
Chenshu Wu
33
0
0
29 Oct 2024
LiGAR: LiDAR-Guided Hierarchical Transformer for Multi-Modal Group Activity Recognition
N. V. R. Chappa
Khoa Luu
39
1
0
28 Oct 2024
BlinkVision: A Benchmark for Optical Flow, Scene Flow and Point Tracking Estimation using RGB Frames and Events
Yijin Li
Yichen Shen
Zhaoyang Huang
Shuo Chen
Weikang Bian
...
Keqiang Sun
Hujun Bao
Zhaopeng Cui
Guofeng Zhang
Hongsheng Li
3DPC
50
5
0
27 Oct 2024
NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction
Z. Gong
Guangyin Bao
Qi Zhang
Zhongwei Wan
Duoqian Miao
...
Changwei Wang
Rongtao Xu
Liang Hu
Ke Liu
Yu Zhang
DiffM
VGen
53
8
0
25 Oct 2024
Low-Latency Video Anonymization for Crowd Anomaly Detection: Privacy vs. Performance
M. Asres
Lei Jiao
C. Omlin
36
0
0
24 Oct 2024
Detecting Adversarial Examples
Furkan Mumcu
Yasin Yilmaz
AAML
21
1
0
22 Oct 2024
AlphaChimp: Tracking and Behavior Recognition of Chimpanzees
Xiaoxuan Ma
Yutang Lin
Yuan Xu
Stephan P. Kaufhold
Jack Terwilliger
Andres Meza
Yixin Zhu
Federico Rossano
Yizhou Wang
36
0
0
22 Oct 2024
Masked Differential Privacy
David Schneider
Sina Sajadmanesh
Vikash Sehwag
Saquib Sarfraz
Rainer Stiefelhagen
Lingjuan Lyu
Vivek Sharma
33
0
0
22 Oct 2024
Storyboard guided Alignment for Fine-grained Video Action Recognition
Enqi Liu
Liyuan Pan
Yan Yang
Yiran Zhong
Zhijing Wu
Xinxiao Wu
Liu Liu
38
0
0
18 Oct 2024
Human Action Anticipation: A Survey
Bolin Lai
Sam Toyer
Tushar Nagarajan
Rohit Girdhar
S. Zha
James M. Rehg
Kris Kitani
Kristen Grauman
Ruta Desai
Miao Liu
AI4TS
41
1
0
17 Oct 2024
MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations
Liang Xu
Shaoyang Hua
Zili Lin
Yifan Liu
Feipeng Ma
Yichao Yan
Xin Jin
Xiaokang Yang
Wenjun Zeng
VGen
39
3
0
17 Oct 2024
On-the-fly Modulation for Balanced Multimodal Learning
Yake Wei
D. Hu
Henghui Du
Zhicheng Dou
28
7
0
15 Oct 2024
MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer
Minghao Zhu
Zhengpu Wang
Mengxian Hu
Ronghao Dang
Xiao Lin
Xun Zhou
Chengju Liu
Qijun Chen
45
1
0
14 Oct 2024
Make the Pertinent Salient: Task-Relevant Reconstruction for Visual Control with Distractions
Kyungmin Kim
JB Lanier
Pierre Baldi
Charless C. Fowlkes
Roy Fox
33
1
0
13 Oct 2024
Human Stone Toolmaking Action Grammar (HSTAG): A Challenging Benchmark for Fine-grained Motor Behavior Recognition
Cheng Liu
Xuyang Yan
Zekun Zhang
Cheng Ding
Tianhao Zhao
Shaya Jannati
Cynthia Martinez
Dietrich Stout
33
1
0
10 Oct 2024
Evaluating Model Performance with Hard-Swish Activation Function Adjustments
Sai Abhinav Pydimarry
Shekhar Madhav Khairnar
Sofia Garces Palacios
Ganesh Sankaranarayanan
Darian Hoagland
Dmitry Nepomnayshy
Huu Phong Nguyen
20
1
0
09 Oct 2024
Secure Video Quality Assessment Resisting Adversarial Attacks
Ao Zhang
Yu Ran
Weixuan Tang
Yuan-Gen Wang
Qingxiao Guan
Chunsheng Yang
AAML
34
0
0
09 Oct 2024
MTFL: Multi-Timescale Feature Learning for Weakly-Supervised Anomaly Detection in Surveillance Videos
Yiling Zhang
Erkut Akdag
Egor Bondarev
Peter H. N. de With
AI4TS
ViT
26
1
0
08 Oct 2024
Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation
Fanqing Meng
Jiaqi Liao
Xinyu Tan
Wenqi Shao
Quanfeng Lu
Kaipeng Zhang
Yu Cheng
Dianqi Li
Yu Qiao
Ping Luo
VGen
EGVM
32
24
0
07 Oct 2024
Bisimulation metric for Model Predictive Control
Yutaka Shimizu
Masayoshi Tomizuka
33
0
0
06 Oct 2024
Linear Transformer Topological Masking with Graph Random Features
Isaac Reid
Kumar Avinava Dubey
Deepali Jain
Will Whitney
Amr Ahmed
...
Connor Schenck
Richard E. Turner
René Wagner
Adrian Weller
Krzysztof Choromanski
27
1
0
04 Oct 2024