1705.07750
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
22 May 2017
João Carreira
Andrew Zisserman
Papers citing
"Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"
50 / 3,645 papers shown
Investigating Memorization in Video Diffusion Models
Chong Chen
Enhuai Liu
Daochang Liu
M. Shah
Chang Xu
VGen
DiffM
159
1
0
29 Oct 2024
Do Vendi Scores Converge with Finite Samples? Truncated Vendi Score for Finite-Sample Convergence Guarantees
Azim Ospanov
Farzan Farnia
210
3
0
29 Oct 2024
Enhancing Action Recognition by Leveraging the Hierarchical Structure of Actions and Textual Context
Manuel Benavent-Lledo
David Mulero-Pérez
David Ortiz-Perez
José García Rodríguez
Antonis Argyros
69
1
0
28 Oct 2024
On Occlusions in Video Action Detection: Benchmark Datasets And Training Recipes
Rajat Modi
Vibhav Vineet
Yogesh S Rawat
86
2
0
25 Oct 2024
MM-WLAuslan: Multi-View Multi-Modal Word-Level Australian Sign Language Recognition Dataset
Xin Shen
Heming Du
Hongwei Sheng
Shuyun Wang
Hui Chen
...
Xiaobiao Du
Jiaying Ying
Ruihan Lu
Qingzheng Xu
Xin Yu
SLR
59
7
0
25 Oct 2024
Low-Latency Video Anonymization for Crowd Anomaly Detection: Privacy vs. Performance
M. Asres
Lei Jiao
C. Omlin
64
0
0
24 Oct 2024
PESFormer: Boosting Macro- and Micro-expression Spotting with Direct Timestamp Encoding
Wang-Wang Yu
Kai-Fu Yang
Xiangrui Hu
Jingwen Jiang
Hong-Mei Yan
Yong-Jie Li
62
0
0
24 Oct 2024
Deep Generative Models for 3D Medical Image Synthesis
Paul Friedrich
Yannik Frisch
P. Cattin
3DV
MedIm
92
4
0
23 Oct 2024
Are Visual-Language Models Effective in Action Recognition? A Comparative Study
Mahmoud Ali
Di Yang
François Brémond
VLM
106
0
0
22 Oct 2024
Masked Differential Privacy
David Schneider
Sina Sajadmanesh
Vikash Sehwag
Saquib Sarfraz
Rainer Stiefelhagen
Lingjuan Lyu
Vivek Sharma
63
0
0
22 Oct 2024
SpikMamba: When SNN meets Mamba in Event-based Human Action Recognition
Jiaqi Chen
Yan Yang
Shizhuo Deng
Da Teng
Liyuan Pan
Mamba
57
1
0
22 Oct 2024
Frontiers in Intelligent Colonoscopy
Ge-Peng Ji
Jingyi Liu
Peng Xu
Nick Barnes
Fahad Shahbaz Khan
Salman Khan
Deng-Ping Fan
127
5
0
22 Oct 2024
ContextDet: Temporal Action Detection with Adaptive Context Aggregation
Ning Wang
Yun Xiao
Xiaopeng Peng
Xiaojun Chang
Xuanhong Wang
Dingyi Fang
102
2
0
20 Oct 2024
Human Action Anticipation: A Survey
Bolin Lai
Sam Toyer
Tushar Nagarajan
Rohit Girdhar
S. Zha
James M. Rehg
Kris Kitani
Kristen Grauman
Ruta Desai
Miao Liu
AI4TS
80
1
0
17 Oct 2024
MotionBank: A Large-scale Video Motion Benchmark with Disentangled Rule-based Annotations
Liang Xu
Shaoyang Hua
Zili Lin
Yifan Liu
Feipeng Ma
Yichao Yan
Xin Jin
Xiaokang Yang
Wenjun Zeng
VGen
107
4
0
17 Oct 2024
CoTracker3: Simpler and Better Point Tracking by Pseudo-Labelling Real Videos
Nikita Karaev
Iurii Makarov
Jianyuan Wang
Natalia Neverova
Andrea Vedaldi
Christian Rupprecht
73
68
0
15 Oct 2024
It's Just Another Day: Unique Video Captioning by Discriminative Prompting
Toby Perrett
Tengda Han
Dima Damen
Andrew Zisserman
81
3
0
15 Oct 2024
MoTE: Reconciling Generalization with Specialization for Visual-Language to Video Knowledge Transfer
Minghao Zhu
Zhengpu Wang
Mengxian Hu
Ronghao Dang
Xiao Lin
Xun Zhou
Chengju Liu
Qijun Chen
79
1
0
14 Oct 2024
Continual Learning Improves Zero-Shot Action Recognition
Shreyank N. Gowda
Davide Moltisanti
Laura Sevilla-Lara
BDL
VLM
CLL
135
1
0
14 Oct 2024
DINTR: Tracking via Diffusion-based Interpolation
Pha Nguyen
Ngan Le
J. Cothren
Alper Yilmaz
Khoa Luu
DiffM
93
1
0
14 Oct 2024
Movie Trailer Genre Classification Using Multimodal Pretrained Features
Serkan Sulun
Paula Viana
M. Davies
CLIP
74
3
0
11 Oct 2024
Understanding Human Activity with Uncertainty Measure for Novelty in Graph Convolutional Networks
Hao Xing
Darius Burschka
68
1
0
10 Oct 2024
Understanding Spatio-Temporal Relations in Human-Object Interaction using Pyramid Graph Convolutional Network
Hao Xing
Darius Burschka
91
12
0
10 Oct 2024
Fourier-based Action Recognition for Wildlife Behavior Quantification with Event Cameras
Friedhelm Hamann
Suman Ghosh
Ignacio Juarez Martinez
Tom Hart
Alex Kacelnik
Guillermo Gallego
58
1
0
09 Oct 2024
Language-Assisted Human Part Motion Learning for Skeleton-Based Temporal Action Segmentation
Bowen Chen
Haoyu Ji
Zhiyong Wang
Benjamin Filtjens
C. Wang
Weihong Ren
Bart Vanrumste
Honghai Liu
107
0
0
08 Oct 2024
MTFL: Multi-Timescale Feature Learning for Weakly-Supervised Anomaly Detection in Surveillance Videos
Yiling Zhang
Erkut Akdag
Egor Bondarev
Peter H. N. de With
AI4TS
ViT
44
1
0
08 Oct 2024
Cefdet: Cognitive Effectiveness Network Based on Fuzzy Inference for Action Detection
Zhe Luo
Weina Fu
Shuai Liu
Saeed Anwar
Muhammad Saqib
Sambit Bakshi
Khan Muhammad
66
2
0
08 Oct 2024
Enhancing Temporal Modeling of Video LLMs via Time Gating
Zi-Yuan Hu
Yiwu Zhong
Shijia Huang
Michael R. Lyu
Liwei Wang
VLM
48
0
0
08 Oct 2024
Studying and Mitigating Biases in Sign Language Understanding Models
Katherine Atwell
Danielle Bragg
Malihe Alikhani
114
0
0
07 Oct 2024
Beyond FVD: Enhanced Evaluation Metrics for Video Generation Quality
Ge Ya Luo
Gian Mario Favero
Zhi Hao Luo
Alexia Jolicoeur-Martineau
Christopher Pal
VGen
76
4
0
07 Oct 2024
L-C4: Language-Based Video Colorization for Creative and Consistent Color
Zheng Chang
Shuchen Weng
Huan Ouyang
Yu Li
Si Li
Boxin Shi
DiffM
VGen
VLM
62
0
0
07 Oct 2024
TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio Motion Embedding and Diffusion Interpolation
Haiyang Liu
Xingchao Yang
Tomoya Akiyama
Yuantian Huang
Qiaoge Li
Shigeru Kuriyama
Takafumi Taketomi
VGen
SLR
75
10
0
05 Oct 2024
Learning Humanoid Locomotion over Challenging Terrain
Ilija Radosavovic
Sarthak Kamat
Trevor Darrell
Jitendra Malik
76
14
0
04 Oct 2024
Does SpatioTemporal information benefit Two video summarization benchmarks?
Aashutosh Ganesh
Mirela Popa
Daan Odijk
Nava Tintarev
AI4TS
35
0
0
04 Oct 2024
AirLetters: An Open Video Dataset of Characters Drawn in the Air
Rishit Dagli
Guillaume Berger
Joanna Materzynska
Ingo Bax
Roland Memisevic
VGen
67
1
0
03 Oct 2024
Dynamic Sparse Training versus Dense Training: The Unexpected Winner in Image Corruption Robustness
Boqian Wu
Q. Xiao
Shunxin Wang
N. Strisciuglio
Mykola Pechenizkiy
M. V. Keulen
Decebal Constantin Mocanu
Elena Mocanu
OOD
3DH
224
3
0
03 Oct 2024
Deep learning for action spotting in association football videos
Silvio Giancola
A. Cioppa
Bernard Ghanem
Marc Van Droogenbroeck
61
3
0
02 Oct 2024
LS-HAR: Language Supervised Human Action Recognition with Salient Fusion, Construction Sites as a Use-Case
Mohammad Mahdavian
Mohammad Loni
Mo Chen
68
0
0
02 Oct 2024
Learnable Expansion of Graph Operators for Multi-Modal Feature Fusion
Dexuan Ding
Lei Wang
Liyun Zhu
Tom Gedeon
Piotr Koniusz
134
9
0
02 Oct 2024
Synthetic imagery for fuzzy object detection: A comparative study
Siavash H. Khajavi
Mehdi Moshtaghi
Dikai Yu
Zixuan Liu
Kary Främling
Jan Holmström
30
0
0
01 Oct 2024
AVID: Adapting Video Diffusion Models to World Models
Marc Rigter
Tarun Gupta
Agrin Hilmkil
Chao Ma
VGen
75
8
0
01 Oct 2024
Domain Aware Multi-Task Pretraining of 3D Swin Transformer for T1-weighted Brain MRI
Jonghun Kim
Mansu Kim
Hyunjin Park
MedIm
ViT
54
0
0
01 Oct 2024
Loose Social-Interaction Recognition in Real-world Therapy Scenarios
Abid Ali
Rui Dai
Ashish Marisetty
Guillaume Astruc
Monique Thonnat
J. Odobez
Susanne Thümmler
Francois Bremond
132
1
0
30 Sep 2024
REST-HANDS: Rehabilitation with Egocentric Vision Using Smartglasses for Treatment of Hands after Surviving Stroke
Wiktor Mucha
Kentaro Tanaka
M. Kampel
82
0
0
30 Sep 2024
SurgPETL: Parameter-Efficient Image-to-Surgical-Video Transfer Learning for Surgical Phase Recognition
Shu Yang
Zhiyuan Cai
Luyang Luo
Ning Ma
Shuchang Xu
Hao Chen
67
1
0
30 Sep 2024
Multimodal LLM Enhanced Cross-lingual Cross-modal Retrieval
Yabing Wang
Le Wang
Qiang-feng Zhou
Zhibin Wang
Hao Li
Gang Hua
Wei Tang
85
10
0
30 Sep 2024
Video DataFlywheel: Resolving the Impossible Data Trinity in Video-Language Understanding
Xiao Wang
Jianlong Wu
Zijia Lin
Fuzheng Zhang
Di Zhang
Liqiang Nie
VGen
65
3
0
29 Sep 2024
A vision-based framework for human behavior understanding in industrial assembly lines
Konstantinos Papoutsakis
Nikolaos Bakalos
Konstantinos Fragkoulis
Athena Zacharia
Georgia Kapetadimitri
Maria Pateraki
66
1
0
25 Sep 2024
Pose-Guided Fine-Grained Sign Language Video Generation
Tongkai Shi
Lianyu Hu
Fanhua Shang
Jichao Feng
Peidong Liu
Wei Feng
VGen
SLR
DiffM
121
2
0
25 Sep 2024
Ctrl-GenAug: Controllable Generative Augmentation for Medical Sequence Classification
Xinrui Zhou
Yuhao Huang
Haoran Dou
Shijing Chen
Ao Chang
...
Jie Jessie Ren
Ruobing Huang
Jun Cheng
Wufeng Xue
Dong Ni
MedIm
387
0
0
25 Sep 2024