The Kinetics Human Action Video Dataset

19 May 2017
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
Sudheendra Vijayanarasimhan
Fabio Viola
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman
arXiv: 1705.06950

Papers citing "The Kinetics Human Action Video Dataset"

50 / 2,016 papers shown
Efficient Selective Audio Masked Multimodal Bottleneck Transformer for Audio-Video Classification
Wentao Zhu
40
4
0
08 Jan 2024
MERBench: A Unified Evaluation Benchmark for Multimodal Emotion Recognition
Zheng Lian
Guoying Zhao
Yong Ren
Hao Gu
Haiyang Sun
Lan Chen
Bin Liu
Jianhua Tao
28
12
0
07 Jan 2024
Efficient Bitrate Ladder Construction using Transfer Learning and Spatio-Temporal Features
A. Falahati
Mohammad Karim Safavi
Ardavan Elahi
Farhad Pakdaman
Moncef Gabbouj
AI4TS
32
1
0
06 Jan 2024
Subjective and Objective Analysis of Indian Social Media Video Quality
Sandeep Mishra
Mukul Jha
A. Bovik
36
0
0
05 Jan 2024
SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge
Dimitrios Psychogyios
Emanuele Colleoni
Beatrice van Amsterdam
Chih-Yang Li
Shu-Yu Huang
...
Santiago Rodriguez
Juanita Puentes
Pablo Arbelaez
Omid Mohareri
Danail Stoyanov
48
24
0
31 Dec 2023
Masked Modeling for Self-supervised Representation Learning on Vision and Beyond
Siyuan Li
Luyuan Zhang
Zedong Wang
Di Wu
Lirong Wu
...
Jun Xia
Cheng Tan
Yang Liu
Baigui Sun
Stan Z. Li
SSL
44
14
0
31 Dec 2023
A Large-Scale Re-identification Analysis in Sporting Scenarios: the Betrayal of Reaching a Critical Point
David Freire-Obregón
J. Lorenzo-Navarro
Oliverio J. Santana
Daniel Hernández-Sosa
Modesto Castrillón-Santana
CVBM
24
1
0
29 Dec 2023
Multiscale Vision Transformers meet Bipartite Matching for efficient single-stage Action Localization
Ioanna Ntinou
Enrique Sanchez
Georgios Tzimiropoulos
55
4
0
29 Dec 2023
Video Understanding with Large Language Models: A Survey
Yunlong Tang
Jing Bi
Siting Xu
Luchuan Song
Susan Liang
...
Feng Zheng
Jianguo Zhang
Ping Luo
Jiebo Luo
Chenliang Xu
VLM
67
84
0
29 Dec 2023
3DTINC: Time-Equivariant Non-Contrastive Learning for Predicting Disease Progression from Longitudinal OCTs
T. Emre
A. Chakravarty
Antoine Rivail
Dmitrii Lachinov
Oliver Leingang
...
S. Sivaprasad
Daniel Rueckert
A. Lotery
U. Schmidt-Erfurth
Hrvoje Bogunović
MedIm
32
3
0
28 Dec 2023
Deformable Audio Transformer for Audio Event Detection
Wentao Zhu
28
0
0
24 Dec 2023
Classifying Soccer Ball-on-Goal Position Through Kicker Shooting Action
Javier Torón-Artiles
Daniel Hernández-Sosa
Oliverio J. Santana
J. Lorenzo-Navarro
David Freire-Obregón
24
0
0
23 Dec 2023
Video Recognition in Portrait Mode
Mingfei Han
Linjie Yang
Xiaojie Jin
Jiashi Feng
Xiaojun Chang
Heng Wang
30
3
0
21 Dec 2023
Bootstrap Masked Visual Modeling via Hard Patches Mining
Haochen Wang
Junsong Fan
Yuxi Wang
Kaiyou Song
Tiancai Wang
Xiangyu Zhang
Zhaoxiang Zhang
47
5
0
21 Dec 2023
SADA: Semantic adversarial unsupervised domain adaptation for Temporal Action Localization
David Pujol-Perich
Albert Clapés
Sergio Escalera
39
0
0
20 Dec 2023
Collaborative Weakly Supervised Video Correlation Learning for Procedure-Aware Instructional Video Analysis
Tianyao He
Huabin Liu
Yuxi Li
Xiao Ma
Cheng Zhong
Yang Zhang
Weiyao Lin
31
5
0
18 Dec 2023
Traffic Incident Database with Multiple Labels Including Various Perspective Environmental Information
Shota Nishiyama
Takuma Saito
Ryo Nakamura
Go Ohtani
Hirokatsu Kataoka
Kensho Hara
32
0
0
17 Dec 2023
CMOSE: Comprehensive Multi-Modality Online Student Engagement Dataset with High-Quality Labels
Chi-hsuan Wu
Shih-yang Liu
Xijie Huang
Xingbo Wang
Rong Zhang
Luca Minciullo
Wong Kai Yiu
Kenny Kwan
Kwang-Ting Cheng
25
1
0
14 Dec 2023
EZ-CLIP: Efficient Zeroshot Video Action Recognition
Shahzad Ahmad
S. Chanda
Yogesh S Rawat
VLM
36
7
0
13 Dec 2023
Counterfactual World Modeling for Physical Dynamics Understanding
Rahul Venkatesh
Honglin Chen
Kevin T. Feigelis
Daniel M. Bear
Khaled Jedoui
...
Wanhee Lee
Sherry Liu
Kevin A. Smith
Judith E. Fan
Daniel L. K. Yamins
VGen
45
1
0
11 Dec 2023
A Cascaded Neural Network System For Rating Student Performance In Surgical Knot Tying Simulation
Yunzhe Xue
Olanrewaju A Eletta
J. Ady
Nell M. Patel
Advaith Bongu
Usman Roshan
42
2
0
09 Dec 2023
A Review of Machine Learning Methods Applied to Video Analysis Systems
Marios S. Pattichis
Venkatesh Jatla
Alvaro E. Ullao Cerna
29
3
0
08 Dec 2023
LifelongMemory: Leveraging LLMs for Answering Queries in Long-form Egocentric Videos
Ying Wang
Yanlai Yang
Mengye Ren
49
15
0
07 Dec 2023
The Potential of Vision-Language Models for Content Moderation of Children's Videos
Syed Hammad Ahmed
Shengnan Hu
G. Sukthankar
VLM
33
3
0
06 Dec 2023
From Detection to Action Recognition: An Edge-Based Pipeline for Robot Human Perception
Petros Toupas
Georgios Tsamis
Dimitrios Giakoumis
K. Votis
Dimitrios Tzovaras
32
0
0
06 Dec 2023
Deep Multimodal Fusion for Surgical Feedback Classification
Rafal Kocielnik
Elyssa Y. Wong
Timothy N. Chu
Lydia Lin
De-An Huang
Jiayun Wang
A. Anandkumar
Andrew J. Hung
35
2
0
06 Dec 2023
DemaFormer: Damped Exponential Moving Average Transformer with Energy-Based Modeling for Temporal Language Grounding
Thong Nguyen
Xiaobao Wu
Xinshuai Dong
Cong-Duy Nguyen
See-Kiong Ng
Anh Tuan Luu
37
8
0
05 Dec 2023
Adapting Short-Term Transformers for Action Detection in Untrimmed Videos
Min Yang
Huan Gao
Ping Guo
Limin Wang
ViT
36
5
0
04 Dec 2023
Hulk: A Universal Knowledge Translator for Human-Centric Tasks
Yizhou Wang
YiXuan Wu
Shixiang Tang
Weizhen He
Xun Guo
...
Lei Bai
Rui Zhao
Jian Wu
Tong He
Wanli Ouyang
VLM
48
14
0
04 Dec 2023
Generating Action-conditioned Prompts for Open-vocabulary Video Action Recognition
Chengyou Jia
Minnan Luo
Xiaojun Chang
Zhuohang Dang
Mingfei Han
Mengmeng Wang
Guangwen Dai
Sizhe Dang
Jingdong Wang
VLM
34
4
0
04 Dec 2023
Towards Generalizable Zero-Shot Manipulation via Translating Human Interaction Plans
Homanga Bharadhwaj
Abhi Gupta
Vikash Kumar
Shubham Tulsiani
LM&Ro
40
39
0
01 Dec 2023
Just Add $π$! Pose Induced Video Transformers for Understanding Activities of Daily Living
Dominick Reilly
Srijan Das
ViT
38
18
0
30 Nov 2023
CAST: Cross-Attention in Space and Time for Video Action Recognition
Dongho Lee
Jongseo Lee
Jinwoo Choi
EgoV
35
12
0
30 Nov 2023
DEVIAS: Learning Disentangled Video Representations of Action and Scene for Holistic Video Understanding
Kyungho Bae
Geo Ahn
Youngrae Kim
Jinwoo Choi
30
3
0
30 Nov 2023
Overcoming Label Noise for Source-free Unsupervised Video Domain Adaptation
A. Dasgupta
C. V. Jawahar
Karteek Alahari
TTA
VLM
26
10
0
30 Nov 2023
VBench: Comprehensive Benchmark Suite for Video Generative Models
Ziqi Huang
Yinan He
Jiashuo Yu
Fan Zhang
Chenyang Si
...
Xinyuan Chen
Limin Wang
Dahua Lin
Yu Qiao
Ziwei Liu
VGen
80
358
0
29 Nov 2023
GeoDeformer: Geometric Deformable Transformer for Action Recognition
Jinhui Ye
Jiaming Zhou
Hui Xiong
Junwei Liang
ViT
26
1
0
29 Nov 2023
Action-slot: Visual Action-centric Representations for Multi-label Atomic Activity Recognition in Traffic Scenes
Chi-Hsi Kung
Shu-Wei Lu
Yi-Hsuan Tsai
Yi-Ting Chen
37
6
0
29 Nov 2023
E-ViLM: Efficient Video-Language Model via Masked Video Modeling with Semantic Vector-Quantized Tokenizer
Jacob Zhiyuan Fang
Skyler Zheng
Vasu Sharma
Robinson Piramuthu
VLM
40
0
0
28 Nov 2023
End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames
Shuming Liu
Chen-Da Liu-Zhang
Chen Zhao
Guohao Li
38
25
0
28 Nov 2023
F4D: Factorized 4D Convolutional Neural Network for Efficient Video-level Representation Learning
Mohammad Al-Saad
Lakshmish Ramaswamy
S. Bhandarkar
AI4TS
24
0
0
28 Nov 2023
Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models
Munan Ning
Bin Zhu
Yujia Xie
Bin Lin
Jiaxi Cui
Lu Yuan
Dongdong Chen
Li-ming Yuan
ELM
MLLM
27
58
0
27 Nov 2023
Temporal Action Localization for Inertial-based Human Activity Recognition
Marius Bock
Michael Moeller
Kristof Van Laerhoven
30
0
0
27 Nov 2023
Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation : A Unified Approach
Ayush K. Rai
Tarun Krishna
Feiyan Hu
Alexandru Drimbarean
Kevin McGuinness
Alan F. Smeaton
Noel E. O'Connor
47
1
0
27 Nov 2023
Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning
Huanjin Yao
Wenhao Wu
Zhiheng Li
VLM
95
9
0
27 Nov 2023
Align before Adapt: Leveraging Entity-to-Region Alignments for Generalizable Video Action Recognition
Yifei Chen
Dapeng Chen
Ruijin Liu
Sai Zhou
Wenyuan Xue
Wei Peng
33
6
0
27 Nov 2023
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
Xiaohan Ding
Yiyuan Zhang
Yixiao Ge
Sijie Zhao
Lin Song
Xiangyu Yue
Ying Shan
VLM
AI4TS
SSL
29
104
0
27 Nov 2023
Mug-STAN: Adapting Image-Language Pretrained Models for General Video Understanding
Ruyang Liu
Jingjia Huang
Wei-Nan Gao
Thomas H. Li
Ge Li
VLM
37
3
0
25 Nov 2023
AutoEval-Video: An Automatic Benchmark for Assessing Large Vision Language Models in Open-Ended Video Question Answering
Xiuyuan Chen
Yuan Lin
Yuchen Zhang
Weiran Huang
ELM
MLLM
31
26
0
25 Nov 2023
Decouple Content and Motion for Conditional Image-to-Video Generation
Cuifeng Shen
Yulu Gan
Chen Chen
Xiongwei Zhu
Lele Cheng
Tingting Gao
Jinzhi Wang
VGen
DiffM
33
5
0
24 Nov 2023