1705.07750
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
22 May 2017
João Carreira
Andrew Zisserman
Papers citing
"Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"
50 / 1,440 papers shown
Title
TransNet: A Transfer Learning-Based Network for Human Action Recognition
Khaled Alomar
Xiaohao Cai
43
1
0
13 Sep 2023
STUPD: A Synthetic Dataset for Spatial and Temporal Relation Reasoning
Palaash Agrawal
Haidi Azaman
Cheston Tan
53
3
0
13 Sep 2023
The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion
Yujin Jeong
Won-Wha Ryoo
Seunghyun Lee
Dabin Seo
Wonmin Byeon
Sangpil Kim
Jinkyu Kim
DiffM
32
29
0
08 Sep 2023
Reuse and Diffuse: Iterative Denoising for Text-to-Video Generation
Jiaxi Gu
Shicong Wang
Haoyu Zhao
Tianyi Lu
Xing Zhang
Zuxuan Wu
Songcen Xu
Wei Zhang
Yu-Gang Jiang
Hang Xu
DiffM
VGen
41
44
0
07 Sep 2023
EgoPCA: A New Framework for Egocentric Hand-Object Interaction Understanding
Yue Xu
Yong-Lu Li
Zhemin Huang
Michael Xu Liu
Cewu Lu
Yu-Wing Tai
Chi-Keung Tang
EgoV
30
9
0
05 Sep 2023
Masked Feature Modelling: Feature Masking for the Unsupervised Pre-training of a Graph Attention Network Block for Bottom-up Video Event Recognition
Dimitrios Daskalakis
Nikolaos Gkalelis
Vasileios Mezaris
40
0
0
24 Aug 2023
Towards Privacy-Supporting Fall Detection via Deep Unsupervised RGB2Depth Adaptation
Hejun Xiao
Kunyu Peng
Xiangsheng Huang
Alina Roitberg
Hao Li
Zhao Wang
Rainer Stiefelhagen
26
3
0
23 Aug 2023
UnLoc: A Unified Framework for Video Localization Tasks
Shengjia Yan
Xuehan Xiong
Arsha Nagrani
Anurag Arnab
Zhonghao Wang
Weina Ge
David A. Ross
Cordelia Schmid
36
53
0
21 Aug 2023
Learnt Contrastive Concept Embeddings for Sign Recognition
Ryan Wong
Necati Cihan Camgöz
Richard Bowden
29
5
0
18 Aug 2023
M³Net: Multi-view Encoding, Matching, and Fusion for Few-shot Fine-grained Action Recognition
Hao Tang
Jun Liu
Shuanglin Yan
Rui Yan
Zechao Li
Jinhui Tang
23
38
0
06 Aug 2023
A Survey on Deep Learning-based Spatio-temporal Action Detection
Peng Wang
Fanwei Zeng
Yu Qian
34
5
0
03 Aug 2023
On Transferability of Driver Observation Models from Simulated to Real Environments in Autonomous Cars
Walter Morales-Alvarez
N. Certad
Alina Roitberg
Rainer Stiefelhagen
Cristina Olaverri-Monreal
41
2
0
31 Jul 2023
AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?
Qi Zhao
Shijie Wang
Ce Zhang
Changcheng Fu
Minh Quan Do
Nakul Agarwal
Kwonjoon Lee
Chen Sun
LM&Ro
61
49
0
31 Jul 2023
Robotic Vision for Human-Robot Interaction and Collaboration: A Survey and Systematic Review
Nicole L. Robinson
Brendan Tidd
Dylan Campbell
Dana Kulić
Peter Corke
46
55
0
28 Jul 2023
Weakly Supervised AI for Efficient Analysis of 3D Pathology Samples
Andrew H. Song
Mane Williams
Drew F. K. Williamson
Guillaume Jaume
Andrew Zhang
...
R. Serafin
Jonathan T. C. Liu
Alexander S. Baras
Anil V. Parwani
Faisal Mahmood
17
4
0
27 Jul 2023
Sample Less, Learn More: Efficient Action Recognition via Frame Feature Restoration
Harry Cheng
Yangyang Guo
Liqiang Nie
Zhiyong Cheng
Mohan S. Kankanhalli
48
7
0
27 Jul 2023
Spatial-Frequency U-Net for Denoising Diffusion Probabilistic Models
Xin Yuan
Linjie Li
Jianfeng Wang
Zhengyuan Yang
Kevin Qinghong Lin
Zicheng Liu
Lijuan Wang
DiffM
65
6
0
27 Jul 2023
Language-based Action Concept Spaces Improve Video Self-Supervised Learning
Kanchana Ranasinghe
Michael S. Ryoo
SSL
VLM
45
12
0
20 Jul 2023
NTIRE 2023 Quality Assessment of Video Enhancement Challenge
Xiaohong Liu
Xiongkuo Min
Wei Sun
Yulun Zhang
Peng Sun
...
Te Shi
Azadeh Mansouri
Hossein Motamednia
Amirhossein Bakhtiari
Ahmad Mahmoudi-Aznaveh
36
18
0
19 Jul 2023
What Can Simple Arithmetic Operations Do for Temporal Modeling?
Wenhao Wu
Yuxin Song
Zhun Sun
Jingdong Wang
Chang Xu
Wanli Ouyang
42
8
0
18 Jul 2023
AltFreezing for More General Video Face Forgery Detection
Zhendong Wang
Jianmin Bao
Wen-gang Zhou
Weilun Wang
Houqiang Li
ViT
CVBM
39
66
0
17 Jul 2023
SoccerKDNet: A Knowledge Distillation Framework for Action Recognition in Soccer Videos
S. Bose
Saikat Sarkar
A. Chakrabarti
34
1
0
15 Jul 2023
Bidirectionally Deformable Motion Modulation For Video-based Human Pose Transfer
W. Yu
L. Po
Ray C. C. Cheung
Yuzhi Zhao
Yu-Zhi Xue
Kun-Jhih Li
3DH
41
22
0
15 Jul 2023
CoTracker: It is Better to Track Together
Nikita Karaev
Ignacio Rocco
Benjamin Graham
Natalia Neverova
Andrea Vedaldi
Christian Rupprecht
VOT
ViT
53
246
0
14 Jul 2023
Multimodal Distillation for Egocentric Action Recognition
Gorjan Radevski
Dusan Grujicic
Marie-Francine Moens
Matthew Blaschko
Tinne Tuytelaars
EgoV
30
23
0
14 Jul 2023
Video-FocalNets: Spatio-Temporal Focal Modulation for Video Action Recognition
Syed Talal Wasim
Muhammad Uzair Khattak
Muzammal Naseer
Salman Khan
M. Shah
Fahad Shahbaz Khan
ViT
54
19
0
13 Jul 2023
Bidirectional Correlation-Driven Inter-Frame Interaction Transformer for Referring Video Object Segmentation
Meng Lan
Fu Rong
Zuchao Li
Wei Yu
Lefei Zhang
VOS
36
5
0
02 Jul 2023
The STOIC2021 COVID-19 AI challenge: applying reusable training methodologies to private data
L. Boulogne
Julian Lorenz
Daniel Kienzle
Robin Schon
K. Ludwig
...
C. Russ
R. Ionasec
Nikos Paragios
Bram van Ginneken
Marieke Dubois
31
4
0
18 Jun 2023
Enhancing the Prediction of Emotional Experience in Movies using Deep Neural Networks: The Significance of Audio and Language
Sogand Mohammadi
M. G. Orimi
Hamid R. Rabiee
29
0
0
17 Jun 2023
Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment
Zihui Xue
Kristen Grauman
EgoV
43
31
0
08 Jun 2023
Atrial Septal Defect Detection in Children Based on Ultrasound Video Using Multiple Instances Learning
Yiman Liu
Qingming Huang
Xiaoxiang Han
Tongtong Liang
Zhi-fang Zhang
...
Angelos Stefanidis
Jionglong Su
Jiangang Chen
Qingli Li
Yuqi Zhang
27
7
0
06 Jun 2023
Human-Object Interaction Prediction in Videos through Gaze Following
Zhifan Ni
Esteve Valls Mascaro
Hyemin Ahn
Dongheui Lee
32
10
0
06 Jun 2023
VideoComposer: Compositional Video Synthesis with Motion Controllability
Xiang Wang
Hangjie Yuan
Shiwei Zhang
Dayou Chen
Jiuniu Wang
Yingya Zhang
Yujun Shen
Deli Zhao
Jingren Zhou
VGen
DiffM
33
319
0
03 Jun 2023
DeepFake-Adapter: Dual-Level Adapter for DeepFake Detection
Rui Shao
Tianxing Wu
Liqiang Nie
Ziwei Liu
34
11
0
01 Jun 2023
A Unified Audio-Visual Learning Framework for Localization, Separation, and Recognition
Shentong Mo
Pedro Morgado
38
21
0
30 May 2023
CVB: A Video Dataset of Cattle Visual Behaviors
Ali Zia
Renuka Sharma
Reza Arablouei
G. Bishop-Hurley
Jody McNally
N. Bagnall
V. Rolland
Brano Kusy
L. Petersson
A. Ingham
34
2
0
26 May 2023
Deep Neural Networks in Video Human Action Recognition: A Review
Zihan Wang
Yang Yang
Zhi Liu
Y. Zheng
61
4
0
25 May 2023
Audio-Visual Dataset and Method for Anomaly Detection in Traffic Videos
Błażej Leporowski
Arian Bakhtiarnia
Nicole Bonnici
A. Muscat
Luca Zanella
Yiming Wang
Alexandros Iosifidis
35
1
0
24 May 2023
Slovo: Russian Sign Language Dataset
A. Kapitanov
Karina Kvanchiani
A.M. Nagaev
Elizaveta Petrova
SLR
13
10
0
23 May 2023
Enhancing Transformer Backbone for Egocentric Video Action Segmentation
Sakib Reza
Balaji Sundareshan
Mohsen Moghaddam
Mario Sznaier
ViT
30
4
0
19 May 2023
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Peng Wang
Shijie Wang
Junyang Lin
Shuai Bai
Xiaohuan Zhou
Jingren Zhou
Xinggang Wang
Chang Zhou
VLM
MLLM
ObjD
50
116
0
18 May 2023
Is end-to-end learning enough for fitness activity recognition?
Antoine Mercier
Guillaume Berger
Sunny Panchal
Florian Letsch
Cornelius Boehm
Nahua Kang
Ingo Bax
Roland Memisevic
28
2
0
14 May 2023
Video-Specific Query-Key Attention Modeling for Weakly-Supervised Temporal Action Localization
Xijun Wang
Aggelos K. Katsaggelos
34
0
0
07 May 2023
ItoV: Efficiently Adapting Deep Learning-based Image Watermarking to Video Watermarking
Guanhui Ye
Jiashi Gao
Yuchen Wang
Liyan Song
Xue-Ming Wei
35
3
0
04 May 2023
Improve Video Representation with Temporal Adversarial Augmentation
Jinhao Duan
Quanfu Fan
Hao-Ran Cheng
Xiaoshuang Shi
Kaidi Xu
AAML
AI4TS
ViT
33
2
0
28 Apr 2023
Learning Human-Human Interactions in Images from Weak Textual Supervision
Morris Alper
Hadar Averbuch-Elor
VLM
52
2
0
27 Apr 2023
GoferBot: A Visual Guided Human-Robot Collaborative Assembly System
Zheyu Zhuang
Yizhak Ben-Shabat
Jiahao Zhang
Stephen Gould
Robert E. Mahony
40
6
0
18 Apr 2023
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
A. Blattmann
Robin Rombach
Huan Ling
Tim Dockhorn
Seung Wook Kim
Sanja Fidler
Karsten Kreis
3DGS
VGen
130
1,019
0
18 Apr 2023
Morph-SSL: Self-Supervision with Longitudinal Morphing to Predict AMD Progression from OCT
A. Chakravarty
T. Emre
Oliver Leingang
Sophie Riedl
Julia Mai
...
S. Sivaprasad
Daniel Rueckert
A. Lotery
U. Schmidt-Erfurth
Hrvoje Bogunović
36
1
0
17 Apr 2023
Soundini: Sound-Guided Diffusion for Natural Video Editing
Seung Hyun Lee
Si-Yeol Kim
Innfarn Yoo
Feng Yang
Donghyeon Cho
Youngseo Kim
Huiwen Chang
Jinkyu Kim
Sangpil Kim
VGen
DiffM
39
15
0
13 Apr 2023