1705.07750
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
22 May 2017
João Carreira
Andrew Zisserman
Papers citing
"Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"
50 / 3,645 papers shown
Title
OwlSight: A Robust Illumination Adaptation Framework for Dark Video Human Action Recognition
Shihao Cheng
Jinlu Zhang
Yue Liu
Zhigang Tu
VLM
65
0
0
30 Mar 2025
Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks
Wei-Jin Huang
Yuan-Ming Li
Zhi-Wei Xia
Yu-Ming Tang
Kun-Yu Lin
Jian-Fang Hu
Wei-Shi Zheng
103
0
0
28 Mar 2025
Comparative Analysis of Image, Video, and Audio Classifiers for Automated News Video Segmentation
Jonathan Attard
Dylan Seychell
107
0
0
27 Mar 2025
Vision-to-Music Generation: A Survey
Zhaokai Wang
Chenxi Bao
Le Zhuo
Jingrui Han
Yang Yue
Yihong Tang
Victor Shea-Jay Huang
Yue Liao
EGVM
VGen
136
1
0
27 Mar 2025
What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning
Chi-Hsi Kung
Frangil Ramirez
Juhyung Ha
Yi-Ting Chen
David J. Crandall
Yi-Hsuan Tsai
139
1
0
27 Mar 2025
Incremental Object Keypoint Learning
Mingfu Liang
Jiahuan Zhou
Xu Zou
Ying Wu
CLL
122
0
0
26 Mar 2025
BEAR: A Video Dataset For Fine-grained Behaviors Recognition Oriented with Action and Environment Factors
Chengyang Hu
Yuduo Chen
Lizhuang Ma
96
0
0
26 Mar 2025
Mamba-3D as Masked Autoencoders for Accurate and Data-Efficient Analysis of Medical Ultrasound Videos
Jiaheng Zhou
Yanfeng Zhou
Wei Fang
Yuxing Tang
Le Lu
Ge Yang
Mamba
515
0
0
26 Mar 2025
Video Motion Graphs
Haiyang Liu
Zhan Xu
Fa-Ting Hong
Hsin-Ping Huang
Yi Zhou
Yang Zhou
DiffM
VGen
155
1
0
26 Mar 2025
Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings
Chengan Che
Chao Wang
Tom Vercauteren
Sophia Tsoka
Luis C. Garcia-Peraza-Herrera
MedIm
84
1
0
25 Mar 2025
Tracktention: Leveraging Point Tracking to Attend Videos Faster and Better
Zihang Lai
Andrea Vedaldi
71
1
0
25 Mar 2025
LLaVAction: evaluating and training multi-modal large language models for action recognition
Shaokai Ye
Haozhe Qi
Alexander Mathis
Mackenzie W. Mathis
134
1
0
24 Mar 2025
VTD-CLIP: Video-to-Text Discretization via Prompting CLIP
Wencheng Zhu
Yuexin Wang
Hongxuan Li
Pengfei Zhu
Q. Hu
CLIP
111
0
0
24 Mar 2025
Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding
Xiangrui Liu
Yan Shu
Zhengyang Liang
Ao Li
Yang Tian
Bo Zhao
VGen
VLM
274
9
0
24 Mar 2025
Cost-Sensitive Learning for Long-Tailed Temporal Action Segmentation
Zhanzhong Pang
Fadime Sener
Shrinivas Ramasubramanian
Angela Yao
112
1
0
24 Mar 2025
ATARS: An Aerial Traffic Atomic Activity Recognition and Temporal Segmentation Dataset
Zihao Chen
Hsuanyu Wu
Chi-Hsi Kung
Yi-Ting Chen
Yan-Tsung Peng
80
1
0
24 Mar 2025
AdaWorld: Learning Adaptable World Models with Latent Actions
Shenyuan Gao
Siyuan Zhou
Yilun Du
Jun Zhang
Chuang Gan
VGen
186
8
0
24 Mar 2025
Context-Enhanced Memory-Refined Transformer for Online Action Detection
Zhanzhong Pang
Fadime Sener
Angela Yao
OffRL
125
2
0
24 Mar 2025
What Time Tells Us? An Explorative Study of Time Awareness Learned from Static Images
Dongheng Lin
Han Hu
Jianbo Jiao
63
0
0
23 Mar 2025
Collaborative Temporal Consistency Learning for Point-supervised Natural Language Video Localization
Zhuo Tao
Liang Li
Qi Chen
Yunbin Tu
Zheng-Jun Zha
Ming-Hsuan Yang
Yuankai Qi
Qingming Huang
77
0
0
22 Mar 2025
Joint Self-Supervised Video Alignment and Action Segmentation
Ali Shah Ali
Syed Ahmed Mahmood
Mubin Saeed
Andrey Konin
M. Zia
Quoc-Huy Tran
OT
104
0
0
21 Mar 2025
Agentic Keyframe Search for Video Question Answering
Sunqi Fan
Meng-Hao Guo
Shuojin Yang
81
0
0
20 Mar 2025
STOP: Integrated Spatial-Temporal Dynamic Prompting for Video Understanding
Zichen Liu
Kunlun Xu
Fuchun Sun
Xu Zou
Yuxin Peng
Jiahuan Zhou
VLM
AI4TS
193
2
0
20 Mar 2025
MiLA: Multi-view Intensive-fidelity Long-term Video Generation World Model for Autonomous Driving
Haiguang Wang
Daqi Liu
Hongwei Xie
Haisong Liu
Enhui Ma
Kaicheng Yu
Limin Wang
Bing Wang
VGen
119
2
0
20 Mar 2025
DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers
Mert Bulent Sariyildiz
Philippe Weinzaepfel
Thomas Lucas
Pau de Jorge
Diane Larlus
Yannis Kalantidis
110
0
0
18 Mar 2025
Condensing Action Segmentation Datasets via Generative Network Inversion
Guodong Ding
Rongyu Chen
Angela Yao
DD
147
1
0
18 Mar 2025
A Real-Time Human Action Recognition Model for Assisted Living
Yixuan Wang
Paul Stynes
Pramod Pathak
Cristina Muntean
60
0
0
18 Mar 2025
GIFT: Generated Indoor video frames for Texture-less point tracking
Jianzheng Huang
Xianyu Mo
Ziling Liu
Jinyu Yang
Feng Zheng
DiffM
3DPC
3DV
VGen
99
0
0
17 Mar 2025
Towards Scalable Modeling of Compressed Videos for Efficient Action Recognition
Shristi Das Biswas
Efstathia Soufleri
Arani Roy
Kaushik Roy
116
0
0
17 Mar 2025
Cross-Modal Consistency Learning for Sign Language Recognition
Kepeng Wu
Zecheng Li
Weichao Zhao
Hezhen Hu
Wengang Zhou
SLR
96
0
0
16 Mar 2025
ISLR101: an Iranian Word-Level Sign Language Recognition Dataset
Hossein Ranjbar
Alireza Taheri
SLR
88
0
0
16 Mar 2025
Multi Activity Sequence Alignment via Implicit Clustering
Taein Kwon
Zador Pataki
Mahdi Rad
Marc Pollefeys
HAI
AI4TS
103
0
0
16 Mar 2025
Domain Generalization for Improved Human Activity Recognition in Office Space Videos Using Adaptive Pre-processing
Partho Ghosh
Raisa Bentay Hossain
Mohammad Zunaed
Taufiq Hasan
101
0
0
16 Mar 2025
VideoMAP: Toward Scalable Mamba-based Video Autoregressive Pretraining
Yunze Liu
Peiran Wu
C. Liang
Junxiao Shen
Limin Wang
Li Yi
Mamba
161
1
0
16 Mar 2025
EQ-TAA: Equivariant Traffic Accident Anticipation via Diffusion-Based Accident Video Synthesis
Jianwu Fang
Lei-lei Li
Zhedong Zheng
Hongkai Yu
Jianru Xue
Zhengguo Li
Tat-Seng Chua
21
0
0
16 Mar 2025
Salient Temporal Encoding for Dynamic Scene Graph Generation
Zhihao Zhu
91
0
0
15 Mar 2025
A Large-Scale Study on Video Action Dataset Condensation
Yang Chen
Sheng Guo
Bo Zheng
Limin Wang
DD
169
3
0
13 Mar 2025
^RFLAV: Rolling Flow matching for infinite Audio Video generation
Alex Ergasti
Giuseppe Tarollo
Filippo Botti
Tomaso Fontanini
Claudio Ferrari
Massimo Bertozzi
Andrea Prati
VGen
80
0
0
13 Mar 2025
PromptGAR: Flexible Promptive Group Activity Recognition
Zhangyu Jin
Andrew Feng
Ankur Chemburkar
Celso M. De Melo
VLM
101
0
0
11 Mar 2025
SignRep: Enhancing Self-Supervised Sign Representations
Ryan Wong
Necati Cihan Camgöz
Richard Bowden
SLR
167
1
0
11 Mar 2025
STEAD: Spatio-Temporal Efficient Anomaly Detection for Time and Compute Sensitive Applications
Andrew Gao
Jun Liu
AI4TS
96
0
0
11 Mar 2025
Analysis of 3D Urticaceae Pollen Classification Using Deep Learning Models
Tijs Konijn
Imaan Bijl
Lu Cao
Fons Verbeek
109
0
0
10 Mar 2025
COMODO: Cross-Modal Video-to-IMU Distillation for Efficient Egocentric Human Activity Recognition
Baiyu Chen
Wilson Wongso
Zechen Li
Yonchanok Khaokaew
Hao Xue
Flora D. Salim
168
1
0
10 Mar 2025
Sign Language Translation using Frame and Event Stream: Benchmark Dataset and Algorithms
Xinyu Wang
Yuchen Li
Fuling Wang
Bo Jiang
Yansen Wang
Yonghong Tian
Jin Tang
Bin Luo
SLR
104
1
0
09 Mar 2025
Online Dense Point Tracking with Streaming Memory
Qiaole Dong
Yanwei Fu
76
0
0
09 Mar 2025
SGA-INTERACT: A 3D Skeleton-based Benchmark for Group Activity Understanding in Modern Basketball Tactic
Yue Yang
Wei Wang
Yifei Liu
Linfeng Dong
Hao Wu
Mingxin Zhang
Zhihang Zhong
Xiao-Fu Sun
82
1
0
09 Mar 2025
Object-Centric World Model for Language-Guided Manipulation
Youngjoon Jeong
Junha Chun
S. Cha
Taesup Kim
OCL
VGen
402
2
0
08 Mar 2025
Get In Video: Add Anything You Want to the Video
Shaobin Zhuang
Zhipeng Huang
Binxin Yang
Ying Zhang
Fangyikang Wang
Canmiao Fu
Chong Sun
Zheng-Jun Zha
Chen Li
Yijiao Wang
DiffM
VGen
107
3
0
08 Mar 2025
End-to-End Action Segmentation Transformer
Tieqiao Wang
Sinisa Todorovic
ViT
89
0
0
08 Mar 2025
Gate-Shift-Pose: Enhancing Action Recognition in Sports with Skeleton Information
Edoardo Bianchi
Oswald Lanz
3DH
106
2
0
06 Mar 2025