The Kinetics Human Action Video Dataset

19 May 2017
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
Sudheendra Vijayanarasimhan
Fabio Viola
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman

Papers citing "The Kinetics Human Action Video Dataset"

50 / 2,017 papers shown
H2O: Two Hands Manipulating Objects for First Person Interaction Recognition
Taein Kwon
Bugra Tekin
Jan Stühmer
Federica Bogo
Marc Pollefeys
EgoV
37
169
0
22 Apr 2021
ImageNet-21K Pretraining for the Masses
T. Ridnik
Emanuel Ben-Baruch
Asaf Noy
Lihi Zelnik-Manor
SSeg
VLM
CLIP
192
690
0
22 Apr 2021
A cappella: Audio-visual Singing Voice Separation
Juan F. Montesinos
V. S. Kadandale
G. Haro
40
16
0
20 Apr 2021
Data-driven vehicle speed detection from synthetic driving simulator images
A. Martínez
Javier Díaz
Iván García Daza
David Fernández Llorca
28
6
0
20 Apr 2021
Temporal Query Networks for Fine-grained Video Understanding
Chuhan Zhang
Ankush Gupta
Andrew Zisserman
24
83
0
19 Apr 2021
What can human minimal videos tell us about dynamic recognition models?
Guy Ben-Yosef
Gabriel Kreiman
S. Ullman
24
2
0
19 Apr 2021
Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training
Chenyi Lei
Shixian Luo
Yong Liu
Wanggui He
Jiamang Wang
Guoxin Wang
Haihong Tang
Chunyan Miao
Houqiang Li
30
41
0
19 Apr 2021
Ego-Exo: Transferring Visual Representations from Third-person to First-person Videos
Yanghao Li
Tushar Nagarajan
Bo Xiong
Kristen Grauman
EgoV
53
84
0
16 Apr 2021
Adaptive Intermediate Representations for Video Understanding
Juhana Kangaspunta
A. Piergiovanni
Rico Jonschkowski
Michael S. Ryoo
A. Angelova
26
3
0
14 Apr 2021
ADNet: Temporal Anomaly Detection in Surveillance Videos
H. Öztürk
Ahmet Burak Can
27
15
0
14 Apr 2021
Unidentified Video Objects: A Benchmark for Dense, Open-World Segmentation
Weiyao Wang
Matt Feiszli
Heng Wang
Du Tran
VOS
15
123
0
10 Apr 2021
ACM-Net: Action Context Modeling Network for Weakly-Supervised Temporal Action Localization
Sanqing Qu
Guang Chen
Zhijun Li
Lijun Zhang
Fan Lu
Alois C. Knoll
17
54
0
07 Apr 2021
Contrastive Learning of Global-Local Video Representations
Shuang Ma
Zhaoyang Zeng
Daniel J. McDuff
Yale Song
SSL
32
7
0
07 Apr 2021
Strumming to the Beat: Audio-Conditioned Contrastive Video Textures
Medhini Narasimhan
Shiry Ginosar
Andrew Owens
Alexei A. Efros
Trevor Darrell
DiffM
21
16
0
06 Apr 2021
MIST: Multiple Instance Self-Training Framework for Video Anomaly Detection
Jianfeng Feng
Fa-Ting Hong
Weishi Zheng
33
240
0
04 Apr 2021
Beyond Short Clips: End-to-End Video-Level Learning with Collaborative Memories
Xitong Yang
Haoqi Fan
Lorenzo Torresani
L. Davis
Heng Wang
VLM
27
20
0
02 Apr 2021
Visual Semantic Role Labeling for Video Understanding
Arka Sadhu
Tanmay Gupta
Mark Yatskar
Ram Nevatia
Aniruddha Kembhavi
VLM
36
68
0
02 Apr 2021
TubeR: Tubelet Transformer for Video Action Detection
Jiaojiao Zhao
Yanyi Zhang
Xinyu Li
Hao Chen
Shuai Bing
...
Yuanjun Xiong
Davide Modolo
I. Marsic
Cees G. M. Snoek
Joseph Tighe
ViT
36
71
0
02 Apr 2021
Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval
Max Bain
Arsha Nagrani
Gül Varol
Andrew Zisserman
VGen
59
1,134
0
01 Apr 2021
Self-supervised Motion Learning from Static Images
Ziyuan Huang
Shiwei Zhang
Jianwen Jiang
Mingqian Tang
Rong Jin
M. Ang
SSL
26
29
0
01 Apr 2021
Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective
Jiarui Xu
Xiaolong Wang
VOS
40
92
0
31 Mar 2021
Learning Representational Invariances for Data-Efficient Action Recognition
Yuliang Zou
Jinwoo Choi
Qitong Wang
Jia-Bin Huang
22
40
0
30 Mar 2021
Broaden Your Views for Self-Supervised Video Learning
Adrià Recasens
Pauline Luc
Jean-Baptiste Alayrac
Luyu Wang
Ross Hemsley
...
Florent Altché
M. Valko
Jean-Bastien Grill
Aaron van den Oord
Andrew Zisserman
SSL
AI4TS
35
127
0
30 Mar 2021
Recognizing Actions in Videos from Unseen Viewpoints
A. Piergiovanni
Michael S. Ryoo
27
25
0
30 Mar 2021
Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
Mingchen Zhuge
D. Gao
Deng-Ping Fan
Linbo Jin
Ben Chen
Hao Zhou
Minghui Qiu
Ling Shao
VLM
30
120
0
30 Mar 2021
PLAN-B: Predicting Likely Alternative Next Best Sequences for Action Prediction
D. Scarafoni
Irfan Essa
Thomas Ploetz
13
1
0
29 Mar 2021
Robust Audio-Visual Instance Discrimination
Pedro Morgado
Ishan Misra
Nuno Vasconcelos
SSL
22
110
0
29 Mar 2021
ViViT: A Video Vision Transformer
Anurag Arnab
Mostafa Dehghani
G. Heigold
Chen Sun
Mario Lucic
Cordelia Schmid
ViT
33
2,098
0
29 Mar 2021
Unified Graph Structured Models for Video Understanding
Anurag Arnab
Chen Sun
Cordelia Schmid
38
44
0
29 Mar 2021
No frame left behind: Full Video Action Recognition
X. Liu
S. Pintea
F. Karimi Nejadasl
Olaf Booij
Jan van Gemert
26
41
0
29 Mar 2021
Self-Attentive 3D Human Pose and Shape Estimation from Videos
Yun-Chun Chen
Marco Piccirilli
Robinson Piramuthu
Ming-Hsuan Yang
3DH
25
11
0
26 Mar 2021
GPRAR: Graph Convolutional Network based Pose Reconstruction and Action Recognition for Human Trajectory Prediction
Manh Huynh
G. Alaghband
3DH
21
2
0
25 Mar 2021
Contrasting Contrastive Self-Supervised Representation Learning Pipelines
Klemen Kotar
Gabriel Ilharco
Ludwig Schmidt
Kiana Ehsani
Roozbeh Mottaghi
SSL
43
46
0
25 Mar 2021
An Image is Worth 16x16 Words, What is a Video Worth?
Gilad Sharir
Asaf Noy
Lihi Zelnik-Manor
ViT
32
122
0
25 Mar 2021
MIcro-Surgical Anastomose Workflow recognition challenge report
Arnaud Huaulmé
Duygu Sarikaya
Kévin Le Mut
Fabien Despinoy
Yonghao Long
...
Pablo Arbelaez
Wolfgang Reiter
M. Mitsuishi
K. Harada
Pierre Jannin
19
34
0
24 Mar 2021
Learning Comprehensive Motion Representation for Action Recognition
Mingyu Wu
Boyuan Jiang
Donghao Luo
Junchi Yan
Yabiao Wang
Ying Tai
Chengjie Wang
Jilin Li
Feiyue Huang
Xiaokang Yang
27
12
0
23 Mar 2021
MoViNets: Mobile Video Networks for Efficient Video Recognition
Dan Kondratyuk
Liangzhe Yuan
Yandong Li
Li Zhang
Mingxing Tan
Matthew A. Brown
Boqing Gong
21
228
0
21 Mar 2021
PGT: A Progressive Method for Training Models on Long Videos
Bo Pang
Gao Peng
Yizhuo Li
Cewu Lu
VLM
27
12
0
21 Mar 2021
Efficient Spatialtemporal Context Modeling for Action Recognition
Congqi Cao
Yue Lu
Yifan Zhang
Dengyang Jiang
Yanning Zhang
33
4
0
20 Mar 2021
MDMMT: Multidomain Multimodal Transformer for Video Retrieval
Maksim Dzabraev
M. Kalashnikov
Stepan Alekseevich Komkov
Aleksandr Petiushko
24
128
0
19 Mar 2021
Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning
Mandela Patrick
Yuki M. Asano
Bernie Huang
Ishan Misra
Florian Metze
Joao Henriques
Andrea Vedaldi
AI4TS
31
33
0
18 Mar 2021
Revisiting ResNets: Improved Training and Scaling Strategies
Irwan Bello
W. Fedus
Xianzhi Du
E. D. Cubuk
A. Srinivas
Nayeon Lee
Jonathon Shlens
Barret Zoph
36
298
0
13 Mar 2021
VDSM: Unsupervised Video Disentanglement with State-Space Modeling and Deep Mixtures of Experts
M. Vowels
Necati Cihan Camgöz
Richard Bowden
CoGe
35
8
0
12 Mar 2021
ACTION-Net: Multipath Excitation for Action Recognition
Zhengwei Wang
Qi She
A. Smolic
3DPC
44
165
0
11 Mar 2021
VideoMoCo: Contrastive Video Representation Learning with Temporally Adversarial Examples
Tian Pan
Yibing Song
Tianyu Yang
Wenhao Jiang
Wei Liu
33
222
0
10 Mar 2021
Understanding the Robustness of Skeleton-based Action Recognition under Adversarial Attack
He Wang
Feixiang He
Zhexi Peng
Tianjia Shao
Yong-Liang Yang
Kun Zhou
David C. Hogg
AAML
42
39
0
09 Mar 2021
PcmNet: Position-Sensitive Context Modeling Network for Temporal Action Localization
Xin Qin
Hanbin Zhao
Guangchen Lin
Hao Zeng
Songcen Xu
Xi Li
44
15
0
09 Mar 2021
PHASE: PHysically-grounded Abstract Social Events for Machine Social Perception
Aviv Netanyahu
Tianmin Shu
Boris Katz
Andrei Barbu
J. Tenenbaum
28
37
0
02 Mar 2021
Coarse-Fine Networks for Temporal Activity Detection in Videos
Kumara Kahatapitiya
Michael S. Ryoo
AI4TS
60
38
0
01 Mar 2021
Surgical Visual Domain Adaptation: Results from the MICCAI 2020 SurgVisDom Challenge
Aneeq Zia
Kiran D. Bhattacharyya
Xi Liu
Ziheng Wang
S. Kondo
...
Raabid Hussain
Lena Maier-Hein
Danail Stoyanov
Stefanie Speidel
A. Jarc
49
20
0
26 Feb 2021