ResearchTrend.AI
The Kinetics Human Action Video Dataset

19 May 2017
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
Sudheendra Vijayanarasimhan
Fabio Viola
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman

Papers citing "The Kinetics Human Action Video Dataset"

50 / 2,017 papers shown
A Real-time Action Representation with Temporal Encoding and Deep Compression
Kun Liu
Wu Liu
Huadong Ma
Mingkui Tan
Chuang Gan
29
35
0
17 Jun 2020
Video Understanding as Machine Translation
Bruno Korbar
Fabio Petroni
Rohit Girdhar
Lorenzo Torresani
SSL
20
29
0
12 Jun 2020
ESAD: Endoscopic Surgeon Action Detection Dataset
V. Bawa
Gurkirt Singh
Francis KapingA
Inna Skarga-Bandurova
A. Leporini
...
Armando Stabile
Francesco Setti
R. Muradore
Elettra Oleari
Fabio Cuzzolin
19
15
0
12 Jun 2020
Understanding Human Hands in Contact at Internet Scale
Dandan Shan
Jiaqi Geng
Michelle Shu
David Fouhey
42
319
0
11 Jun 2020
Disentangled Non-Local Neural Networks
Minghao Yin
Zhuliang Yao
Yue Cao
Xiu Li
Zheng-Wei Zhang
Stephen Lin
Han Hu
17
327
0
11 Jun 2020
PNL: Efficient Long-Range Dependencies Extraction with Pyramid Non-Local Module for Action Recognition
Yuecong Xu
Haozhi Cao
Jianfei Yang
K. Mao
Jianxiong Yin
Simon See
18
5
0
09 Jun 2020
Temporal Aggregate Representations for Long-Range Video Understanding
Fadime Sener
Dipika Singhania
Angela Yao
AI4TS
30
7
0
01 Jun 2020
Large Scale Audiovisual Learning of Sounds with Weakly Labeled Data
Haytham M. Fayek
Anurag Kumar
14
35
0
29 May 2020
A Multi-modal Approach to Fine-grained Opinion Mining on Video Reviews
Edison Marrese-Taylor
Cristian Rodriguez-Opazo
Jorge A. Balazs
Stephen Gould
Y. Matsuo
33
3
0
27 May 2020
AnimGAN: A Spatiotemporally-Conditioned Generative Adversarial Network for Character Animation
Maryam Sadat Mirzaei
Kourosh Meshgi
Etienne Frigo
T. Nishida
GAN
27
13
0
23 May 2020
S3VAE: Self-Supervised Sequential VAE for Representation Disentanglement and Data Generation
Yizhe Zhu
Martin Renqiang Min
Asim Kadav
H. Graf
CoGe
DRL
32
95
0
23 May 2020
Intra- and Inter-Action Understanding via Temporal Action Parsing
Dian Shao
Yue Zhao
Bo Dai
Dahua Lin
22
71
0
20 May 2020
Toward Automated Classroom Observation: Multimodal Machine Learning to Estimate CLASS Positive Climate and Negative Climate
Anand Ramakrishnan
Brian Zylich
Erin Ottmar
Jennifer LoCasale-Crouch
Jacob Whitehill
24
25
0
19 May 2020
Building BROOK: A Multi-modal and Facial Video Database for Human-Vehicle Interaction Research
Xiangjun Peng
Zhentao Huang
Xu Sun
14
10
0
18 May 2020
Project RISE: Recognizing Industrial Smoke Emissions
Yen-Chia Hsu
Ting-Hao 'Kenneth' Huang
Ting-Yao Hu
P. Dille
Sean Prendi
Ryan N. Hoffman
Anastasia Tsuhlares
Jessica Pachuta
Randy Sargent
I. Nourbakhsh
45
19
0
13 May 2020
Compositional Few-Shot Recognition with Primitive Discovery and Enhancing
Yixiong Zou
Shanghang Zhang
Ke Chen
Yonghong Tian
Yaowei Wang
J. M. F. Moura
28
26
0
12 May 2020
Unsupervised Multi-label Dataset Generation from Web Data
Carlos Roig
David Varas
Issey Masuda
J. C. Riveiro
Elisenda Bou
13
3
0
12 May 2020
Condensed Movies: Story Based Retrieval with Contextual Embeddings
Max Bain
Arsha Nagrani
A. Brown
Andrew Zisserman
41
100
0
08 May 2020
Learning to Segment Actions from Observation and Narration
Daniel Fried
Jean-Baptiste Alayrac
Phil Blunsom
Chris Dyer
S. Clark
Aida Nematzadeh
33
31
0
07 May 2020
DramaQA: Character-Centered Video Story Understanding with Hierarchical QA
Seongho Choi
Kyoung-Woon On
Y. Heo
Ahjeong Seo
Youwon Jang
Minsu Lee
Byoung-Tak Zhang
32
52
0
07 May 2020
Exploiting Inter-Frame Regional Correlation for Efficient Action Recognition
Yuecong Xu
Jianfei Yang
K. Mao
Jianxiong Yin
Simon See
8
11
0
06 May 2020
Recognizing American Sign Language Nonmanual Signal Grammar Errors in Continuous Videos
Elahe Vahdani
Longlong Jing
Yingli Tian
Matt Huenerfauth
9
8
0
01 May 2020
The AVA-Kinetics Localized Human Actions Video Dataset
Ang Li
Meghana Thotakuri
David A. Ross
João Carreira
Alexander Vostrikov
Andrew Zisserman
VGen
19
133
0
01 May 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li
Yen-Chun Chen
Yu Cheng
Zhe Gan
Licheng Yu
Jingjing Liu
MLLM
VLM
OffRL
AI4TS
62
494
0
01 May 2020
Beyond Instructional Videos: Probing for More Diverse Visual-Textual Grounding on YouTube
Jack Hessel
Zhenhai Zhu
Bo Pang
Radu Soricut
23
4
0
29 Apr 2020
Audio-Visual Instance Discrimination with Cross-Modal Agreement
Pedro Morgado
Nuno Vasconcelos
Ishan Misra
SSL
33
270
0
27 Apr 2020
Gabriella: An Online System for Real-Time Activity Detection in Untrimmed Security Videos
Mamshad Nayeem Rizve
Ugur Demir
Praveen Tirupattur
A. J. Rana
Kevin Duarte
Ishan R. Dave
Yogesh S Rawat
M. Shah
13
19
0
23 Apr 2020
TAEN: Temporal Aware Embedding Network for Few-Shot Action Recognition
Rami Ben-Ari
Mor Shpigel
Ophir Azulai
Udi Barzelay
Daniel Rotman
ViT
31
25
0
21 Apr 2020
Adversarial Distortion for Learned Video Compression
Vijay Veerabadran
Reza Pourreza
A. Habibian
Taco S. Cohen
GAN
43
13
0
20 Apr 2020
CatNet: Class Incremental 3D ConvNets for Lifelong Egocentric Gesture Recognition
Zhengwei Wang
Qi She
Tejo Chalasani
A. Smolic
3DPC
SLR
27
15
0
20 Apr 2020
ImagePairs: Realistic Super Resolution Dataset via Beam Splitter Camera Rig
Hamid Reza Vaezi Joze
Ilya Zharkov
Karlton D. Powell
Carl Ringler
Luming Liang
A. Roulston
Moshe R. Lutz
V. Pradeep
SupR
11
30
0
18 Apr 2020
FineGym: A Hierarchical Video Dataset for Fine-grained Action Understanding
Dian Shao
Yue Zhao
Bo Dai
Dahua Lin
9
321
0
14 Apr 2020
SpeedNet: Learning the Speediness in Videos
Sagie Benaim
Ariel Ephrat
Oran Lang
Inbar Mosseri
William T. Freeman
Michael Rubinstein
Michal Irani
Tali Dekel
28
257
0
13 Apr 2020
Self-supervised Feature Learning by Cross-modality and Cross-view Correspondences
Longlong Jing
Yucheng Chen
Ling Zhang
Mingyi He
Yingli Tian
3DPC
SSL
22
34
0
13 Apr 2020
Improved Residual Networks for Image and Video Recognition
Ionut Cosmin Duta
Li Liu
Fan Zhu
Ling Shao
SSeg
AI4TS
20
170
0
10 Apr 2020
Spatiotemporal Fusion in 3D CNNs: A Probabilistic View
Yizhou Zhou
Xiaoyan Sun
Chong Luo
Zhengjun Zha
Wenjun Zeng
3DPC
19
20
0
10 Apr 2020
Feedback Recurrent Autoencoder for Video Compression
Adam Goliński
Reza Pourreza
Yang Yang
Guillaume Sautière
Taco S. Cohen
VGen
41
48
0
09 Apr 2020
When, Where, and What? A New Dataset for Anomaly Detection in Driving Videos
Yu Yao
Xizi Wang
Mingze Xu
Zelin Pu
E. Atkins
David J. Crandall
40
44
0
06 Apr 2020
TimeGate: Conditional Gating of Segments in Long-range Activities
Noureldien Hussein
Mihir Jain
B. Bejnordi
AI4TS
18
16
0
03 Apr 2020
Temporal Accumulative Features for Sign Language Recognition
A. Kındıroglu
Ogulcan Özdemir
L. Akarun
SLR
16
18
0
02 Apr 2020
SCT: Set Constrained Temporal Transformer for Set Supervised Action Segmentation
Mohsen Fayyaz
Juergen Gall
ViT
17
70
0
31 Mar 2020
Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition
Ziyu Liu
Hongwen Zhang
Zhenghao Chen
Zhiyong Wang
Wanli Ouyang
32
818
0
31 Mar 2020
Spatio-Temporal Graph for Video Captioning with Knowledge Distillation
Boxiao Pan
Haoye Cai
De-An Huang
Kuan-Hui Lee
Adrien Gaidon
Ehsan Adeli
Juan Carlos Niebles
31
235
0
31 Mar 2020
TITAN: Future Forecast using Action Priors
Srikanth Malla
Behzad Dariush
Chiho Choi
14
121
0
31 Mar 2020
Speech2Action: Cross-modal Supervision for Action Recognition
Arsha Nagrani
Chen Sun
David A. Ross
Rahul Sukthankar
Cordelia Schmid
Andrew Zisserman
33
54
0
30 Mar 2020
Learning a Weakly-Supervised Video Actor-Action Segmentation Model with a Wise Selection
Jie Chen
Zhiheng Li
Jiebo Luo
Chenliang Xu
27
13
0
29 Mar 2020
Omni-sourced Webly-supervised Learning for Video Recognition
Haodong Duan
Yue Zhao
Yuanjun Xiong
Wentao Liu
Dahua Lin
VLM
23
88
0
29 Mar 2020
Weakly-Supervised Action Localization by Generative Attention Modeling
Baifeng Shi
Qi Dai
Yadong Mu
Jingdong Wang
WSOL
10
146
0
27 Mar 2020
Grounded Situation Recognition
Sarah M Pratt
Mark Yatskar
Luca Weihs
Ali Farhadi
Aniruddha Kembhavi
25
112
0
26 Mar 2020
Coronary Artery Segmentation in Angiographic Videos Using A 3D-2D CE-Net
Lu Wang
Dongxue Liang
Xiao-Lei Yin
Jing Qiu
Zhi-Yun Yang
Jun-Hui Xing
Jian-Zeng Dong
Zhao-Yuan Ma
MedIm
21
0
0
26 Mar 2020