1705.06950
Cited By
The Kinetics Human Action Video Dataset
19 May 2017
W. Kay
João Carreira
Karen Simonyan
Brian Zhang
Chloe Hillier
Sudheendra Vijayanarasimhan
Fabio Viola
Tim Green
T. Back
Apostol Natsev
Mustafa Suleyman
Andrew Zisserman
Papers citing
"The Kinetics Human Action Video Dataset"
50 / 2,016 papers shown
Title
Input Compression with Positional Consistency for Efficient Training and Inference of Transformer Neural Networks
Amrit Nagarajan
Anand Raghunathan
VLM
ViT
31
0
0
22 Nov 2023
Quantifying Impairment and Disease Severity Using AI Models Trained on Healthy Subjects
Boyang Yu
Aakash Kaku
Kangning Liu
A. Parnandi
Emily E Fokas
Anita Venkatesan
Natasha Pandit
Rajesh Ranganath
Heidi M. Schambra
C. Fernandez‐Granda
29
0
0
21 Nov 2023
GLAD: Global-Local View Alignment and Background Debiasing for Unsupervised Video Domain Adaptation with Large Domain Gap
Hyogun Lee
Kyungho Bae
Seong Jong Ha
Yumin Ko
Gyeong-Moon Park
Jinwoo Choi
19
2
0
21 Nov 2023
Fingerspelling PoseNet: Enhancing Fingerspelling Translation with Pose-Based Transformer Models
Pooya Fayyazsanavi
Negar Nejatishahidin
Jana Kosecka
SLR
34
1
0
20 Nov 2023
A Multi-In-Single-Out Network for Video Frame Interpolation without Optical Flow
Jaemin Lee
Min-seok Seo
Sangwoo Lee
Hyobin Park
Dong-Geol Choi
30
0
0
20 Nov 2023
HIDRO-VQA: High Dynamic Range Oracle for Video Quality Assessment
Shreshth Saini
Avinab Saha
A. Bovik
11
4
0
18 Nov 2023
Breaking Temporal Consistency: Generating Video Universal Adversarial Perturbations Using Image Models
Heeseon Kim
Minji Son
Minbeom Kim
Myung-Joon Kwon
Changick Kim
AAML
37
7
0
17 Nov 2023
JWSign: A Highly Multilingual Corpus of Bible Translations for more Diversity in Sign Language Processing
Shester Gueuwou
Sophie Siake
Colin Leong
Mathias Müller
SLR
40
11
0
16 Nov 2023
VideoCon: Robust Video-Language Alignment via Contrast Captions
Hritik Bansal
Yonatan Bitton
Idan Szpektor
Kai-Wei Chang
Aditya Grover
46
15
0
15 Nov 2023
CLiF-VQA: Enhancing Video Quality Assessment by Incorporating High-Level Semantic Information related to Human Feelings
Yachun Mi
Yu Li
Yan Shu
Chen Hui
Puchao Zhou
Shaohui Liu
41
7
0
13 Nov 2023
PECoP: Parameter Efficient Continual Pretraining for Action Quality Assessment
Amirhossein Dadashzadeh
Shuchao Duan
Alan Whone
Majid Mirmehdi
27
7
0
11 Nov 2023
PolyMaX: General Dense Prediction with Mask Transformer
Xuan S. Yang
Liangzhe Yuan
Kimberly Wilber
Astuti Sharma
Xiuye Gu
...
Stephanie Debats
Huisheng Wang
Hartwig Adam
Mikhail Sirotenko
Liang-Chieh Chen
39
14
0
09 Nov 2023
CLearViD: Curriculum Learning for Video Description
Cheng-Yu Chuang
Pooyan Fazli
43
1
0
08 Nov 2023
OmniVec: Learning robust representations with cross modal sharing
Siddharth Srivastava
Gaurav Sharma
SSL
42
64
0
07 Nov 2023
ACQUIRED: A Dataset for Answering Counterfactual Questions In Real-Life Videos
Te-Lin Wu
Zi-Yi Dou
Qingyuan Hu
Yu Hou
Nischal Reddy Chandra
Marjorie Freedman
R. Weischedel
Nanyun Peng
44
5
0
02 Nov 2023
POS: A Prompts Optimization Suite for Augmenting Text-to-Video Generation
Shijie Ma
Huayi Xu
Mengjian Li
Weidong Geng
Yaxiong Wang
Meng Wang
DiffM
VGen
19
0
0
02 Nov 2023
ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab
Jieming Cui
Ziren Gong
Baoxiong Jia
Siyuan Huang
Zilong Zheng
Jianzhu Ma
Yixin Zhu
47
3
0
01 Nov 2023
Object-centric Video Representation for Long-term Action Anticipation
Ce Zhang
Changcheng Fu
Shijie Wang
Nakul Agarwal
Kwonjoon Lee
Chiho Choi
Chen Sun
50
14
0
31 Oct 2023
SimMMDG: A Simple and Effective Framework for Multi-modal Domain Generalization
Hao Dong
Ismail Nejjar
Han Sun
Eleni Chatzi
Olga Fink
36
19
0
30 Oct 2023
A Hybrid Graph Network for Complex Activity Detection in Video
Salman Khan
Izzeddin Teeti
Andrew Bradley
Mohamed Elhoseiny
Fabio Cuzzolin
34
2
0
26 Oct 2023
PETA: Evaluating the Impact of Protein Transfer Learning with Sub-word Tokenization on Downstream Applications
Yang Tan
Mingchen Li
P. Tan
Ziyi Zhou
Huiqun Yu
Guisheng Fan
Liang Hong
31
0
0
26 Oct 2023
IndustReal: A Dataset for Procedure Step Recognition Handling Execution Errors in Egocentric Videos in an Industrial-Like Setting
Tim J. Schoonbeek
Tim Houben
H. Onvlee
Peter H. N. de With
Fons van der Sommen
54
23
0
26 Oct 2023
Affective Video Content Analysis: Decade Review and New Perspectives
Junxiao Xue
Jie Wang
Xuecheng Wu
Qian Zhang
25
0
0
26 Oct 2023
Towards Control-Centric Representations in Reinforcement Learning from Images
Chen Liu
Hongyu Zang
Xin Li
Yong Heng
Yifei Wang
Zhen Fang
Yisen Wang
Mingzhong Wang
33
0
0
25 Oct 2023
ChimpACT: A Longitudinal Dataset for Understanding Chimpanzee Behaviors
Xiaoxuan Ma
Stephan P. Kaufhold
Jiajun Su
Wentao Zhu
Jack Terwilliger
Andres Meza
Yixin Zhu
Federico Rossano
Yizhou Wang
21
14
0
25 Oct 2023
Geometry-Aware Video Quality Assessment for Dynamic Digital Human
Zicheng Zhang
Yingjie Zhou
Wei Sun
Xiongkuo Min
Guangtao Zhai
30
8
0
24 Oct 2023
Videoprompter: an ensemble of foundational models for zero-shot video understanding
Adeel Yousaf
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
Mubarak Shah
VLM
38
2
0
23 Oct 2023
S3Aug: Segmentation, Sampling, and Shift for Action Recognition
Taiki Sugiura
Toru Tamaki
AI4TS
31
2
0
23 Oct 2023
NurViD: A Large Expert-Level Video Database for Nursing Procedure Activity Understanding
Ming Hu
Lin Wang
Siyuan Yan
Don Ma
Qingli Ren
Peng Xia
Wei Feng
Peibo Duan
Lie Ju
Zongyuan Ge
27
13
0
20 Oct 2023
Human Pose-based Estimation, Tracking and Action Recognition with Deep Learning: A Survey
Lijuan Zhou
Xiang Meng
Zhihuan Liu
Mengqi Wu
Zhimin Gao
Pichao Wang
49
3
0
19 Oct 2023
EvalCrafter: Benchmarking and Evaluating Large Video Generation Models
Yaofang Liu
Xiaodong Cun
Xuebo Liu
Xintao Wang
Yong Zhang
Haoxin Chen
Yang Liu
Tieyong Zeng
Raymond H. F. Chan
Ying Shan
VGen
EGVM
32
129
0
17 Oct 2023
Is ImageNet worth 1 video? Learning strong image encoders from 1 long unlabelled video
Shashanka Venkataramanan
Mamshad Nayeem Rizve
João Carreira
Yuki M. Asano
Yannis Avrithis
SSL
42
18
0
12 Oct 2023
Watt For What: Rethinking Deep Learning's Energy-Performance Relationship
Shreyank N. Gowda
Xinyue Hao
Gen Li
Laura Sevilla-Lara
Shashank Narayana Gowda
HAI
18
10
0
10 Oct 2023
Efficient Adaptation of Large Vision Transformer via Adapter Re-Composing
Wei Dong
Dawei Yan
Zhijun Lin
Peng Wang
27
21
0
10 Oct 2023
Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation
Lijun Yu
José Lezama
N. B. Gundavarapu
Luca Versari
Kihyuk Sohn
...
Boqing Gong
Ming-Hsuan Yang
Irfan Essa
David A. Ross
Lu Jiang
32
285
0
09 Oct 2023
Learning Generalizable Agents via Saliency-Guided Features Decorrelation
Sili Huang
Yanchao Sun
Jifeng Hu
Siyuan Guo
Hechang Chen
Yi-Ju Chang
Lichao Sun
Bo Yang
26
5
0
08 Oct 2023
Building an Open-Vocabulary Video CLIP Model with Better Architectures, Optimization and Data
Zuxuan Wu
Zejia Weng
Wujian Peng
Xitong Yang
Ang Li
Larry S. Davis
Yu-Gang Jiang
CLIP
VLM
41
21
0
08 Oct 2023
Multiple Physics Pretraining for Physical Surrogate Models
Michael McCabe
Bruno Régaldo-Saint Blancard
Liam Parker
Ruben Ohana
M. Cranmer
...
Francois Lanusse
Mariel Pettee
Tiberiu Teşileanu
Kyunghyun Cho
Shirley Ho
PINN
AI4CE
40
54
0
04 Oct 2023
Delving into CLIP latent space for Video Anomaly Recognition
Luca Zanella
Benedetta Liberatori
Willi Menapace
Fabio Poiesi
Yiming Wang
Elisa Ricci
31
23
0
04 Oct 2023
How Physics and Background Attributes Impact Video Transformers in Robotic Manipulation: A Case Study on Planar Pushing
Shutong Jin
Ruiyu Wang
Muhammad Zahid
Florian T. Pokorny
38
1
0
03 Oct 2023
Beyond the Benchmark: Detecting Diverse Anomalies in Videos
Yoav Arad
Michael Werman
16
2
0
03 Oct 2023
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
Bin Zhu
Bin Lin
Munan Ning
Yang Yan
Jiaxi Cui
...
Zongwei Li
Wancai Zhang
Zhifeng Li
Wei Liu
Liejie Yuan
VLM
MLLM
34
207
0
03 Oct 2023
Continual Action Assessment via Task-Consistent Score-Discriminative Feature Distribution Modeling
Yuan-Ming Li
Ling-an Zeng
Jing-Ke Meng
Wei-Shi Zheng
38
9
0
29 Sep 2023
Weakly-Supervised Video Anomaly Detection with Snippet Anomalous Attention
Yidan Fan
Yongxin Yu
Wenhuan Lu
Yahong Han
30
20
0
28 Sep 2023
M$^{3}$3D: Learning 3D priors using Multi-Modal Masked Autoencoders for 2D image and video understanding
Muhammad Abdullah Jamal
Omid Mohareri
3DPC
24
1
0
26 Sep 2023
Treating Motion as Option with Output Selection for Unsupervised Video Object Segmentation
Suhwan Cho
Minhyeok Lee
Jungho Lee
Myeongah Cho
Seungwook Park
Jaeyeob Kim
Hyunsung Jang
Sangyoun Lee
VOS
68
2
0
26 Sep 2023
Automatic Animation of Hair Blowing in Still Portrait Photos
Wenpeng Xiao
Wentao Liu
Yitong Wang
Guohao Li
Bing Li
3DH
44
10
0
25 Sep 2023
Egocentric RGB+Depth Action Recognition in Industry-Like Settings
Jyoti Kini
Sarah Fleischer
I. Dave
Mubarak Shah
EgoV
36
2
0
25 Sep 2023
Speed Co-Augmentation for Unsupervised Audio-Visual Pre-training
Jiangliu Wang
Jianbo Jiao
Yibing Song
Stephen James
Zhan Tong
Chongjian Ge
Pieter Abbeel
Yunhui Liu
20
0
0
25 Sep 2023
Fully Transformer-Equipped Architecture for End-to-End Referring Video Object Segmentation
P. Li
Yu Zhang
L. Yuan
Xianghua Xu
VOS
29
6
0
21 Sep 2023