ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1705.07750
  4. Cited By
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
v1v2v3 (latest)

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

22 May 2017
João Carreira
Andrew Zisserman
ArXiv (abs)PDFHTML

Papers citing "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"

50 / 3,645 papers shown
Title
OwlSight: A Robust Illumination Adaptation Framework for Dark Video Human Action Recognition
OwlSight: A Robust Illumination Adaptation Framework for Dark Video Human Action Recognition
Shihao Cheng
Jinlu Zhang
Yue Liu
Zhigang Tu
VLM
65
0
0
30 Mar 2025
Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks
Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks
Wei-Jin Huang
Yuan-Ming Li
Zhi-Wei Xia
Yu-Ming Tang
Kun-Yu Lin
Jian-Fang Hu
Wei-Shi Zheng
103
0
0
28 Mar 2025
Comparative Analysis of Image, Video, and Audio Classifiers for Automated News Video Segmentation
Comparative Analysis of Image, Video, and Audio Classifiers for Automated News Video Segmentation
Jonathan Attard
Dylan Seychell
107
0
0
27 Mar 2025
Vision-to-Music Generation: A Survey
Vision-to-Music Generation: A Survey
Zhaokai Wang
Chenxi Bao
Le Zhuo
Jingrui Han
Yang Yue
Yihong Tang
Victor Shea-Jay Huang
Yue Liao
EGVMVGen
136
1
0
27 Mar 2025
What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning
What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning
Chi-Hsi Kung
Frangil Ramirez
Juhyung Ha
Yi-Ting Chen
David J. Crandall
Yi-Hsuan Tsai
139
1
0
27 Mar 2025
Incremental Object Keypoint Learning
Incremental Object Keypoint Learning
Mingfu Liang
Jiahuan Zhou
Xu Zou
Ying Wu
CLL
122
0
0
26 Mar 2025
BEAR: A Video Dataset For Fine-grained Behaviors Recognition Oriented with Action and Environment Factors
BEAR: A Video Dataset For Fine-grained Behaviors Recognition Oriented with Action and Environment Factors
Chengyang Hu
Yuduo Chen
Lizhuang Ma
96
0
0
26 Mar 2025
Mamba-3D as Masked Autoencoders for Accurate and Data-Efficient Analysis of Medical Ultrasound Videos
Mamba-3D as Masked Autoencoders for Accurate and Data-Efficient Analysis of Medical Ultrasound Videos
Jiaheng Zhou
Yanfeng Zhou
Wei Fang
Yuxing Tang
Le Lu
Ge Yang
Mamba
515
0
0
26 Mar 2025
Video Motion Graphs
Video Motion Graphs
Haiyang Liu
Zhan Xu
Fa-Ting Hong
Hsin-Ping Huang
Yi Zhou
Yang Zhou
DiffMVGen
155
1
0
26 Mar 2025
Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings
Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings
Chengan Che
Chao Wang
Tom Vercauteren
Sophia Tsoka
Luis C. Garcia-Peraza-Herrera
MedIm
84
1
0
25 Mar 2025
Tracktention: Leveraging Point Tracking to Attend Videos Faster and Better
Tracktention: Leveraging Point Tracking to Attend Videos Faster and Better
Zihang Lai
Andrea Vedaldi
71
1
0
25 Mar 2025
LLaVAction: evaluating and training multi-modal large language models for action recognition
LLaVAction: evaluating and training multi-modal large language models for action recognition
Shaokai Ye
Haozhe Qi
Alexander Mathis
Mackenzie W. Mathis
134
1
0
24 Mar 2025
VTD-CLIP: Video-to-Text Discretization via Prompting CLIP
VTD-CLIP: Video-to-Text Discretization via Prompting CLIP
Wencheng Zhu
Yuexin Wang
Hongxuan Li
Pengfei Zhu
Q. Hu
CLIP
111
0
0
24 Mar 2025
Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding
Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding
Xiangrui Liu
Yan Shu
Zhengyang Liang
Ao Li
Yang Tian
Bo Zhao
VGenVLM
274
9
0
24 Mar 2025
Cost-Sensitive Learning for Long-Tailed Temporal Action Segmentation
Cost-Sensitive Learning for Long-Tailed Temporal Action Segmentation
Zhanzhong Pang
Fadime Sener
Shrinivas Ramasubramanian
Angela Yao
112
1
0
24 Mar 2025
ATARS: An Aerial Traffic Atomic Activity Recognition and Temporal Segmentation Dataset
ATARS: An Aerial Traffic Atomic Activity Recognition and Temporal Segmentation Dataset
Zihao Chen
Hsuanyu Wu
Chi-Hsi Kung
Yi-Ting Chen
Yan-Tsung Peng
80
1
0
24 Mar 2025
AdaWorld: Learning Adaptable World Models with Latent Actions
AdaWorld: Learning Adaptable World Models with Latent Actions
Shenyuan Gao
Siyuan Zhou
Yilun Du
Jun Zhang
Chuang Gan
VGen
186
8
0
24 Mar 2025
Context-Enhanced Memory-Refined Transformer for Online Action Detection
Context-Enhanced Memory-Refined Transformer for Online Action Detection
Zhanzhong Pang
Fadime Sener
Angela Yao
OffRL
125
2
0
24 Mar 2025
What Time Tells Us? An Explorative Study of Time Awareness Learned from Static Images
What Time Tells Us? An Explorative Study of Time Awareness Learned from Static Images
Dongheng Lin
Han Hu
Jianbo Jiao
63
0
0
23 Mar 2025
Collaborative Temporal Consistency Learning for Point-supervised Natural Language Video Localization
Collaborative Temporal Consistency Learning for Point-supervised Natural Language Video Localization
Zhuo Tao
Liang Li
Qi Chen
Yunbin Tu
Zheng-Jun Zha
Ming-Hsuan Yang
Yuankai Qi
Qingming Huang
77
0
0
22 Mar 2025
Joint Self-Supervised Video Alignment and Action Segmentation
Joint Self-Supervised Video Alignment and Action Segmentation
Ali Shah Ali
Syed Ahmed Mahmood
Mubin Saeed
Andrey Konin
M. Zia
Quoc-Huy Tran
OT
104
0
0
21 Mar 2025
Agentic Keyframe Search for Video Question Answering
Agentic Keyframe Search for Video Question Answering
Sunqi Fan
Meng-Hao Guo
Shuojin Yang
81
0
0
20 Mar 2025
STOP: Integrated Spatial-Temporal Dynamic Prompting for Video Understanding
STOP: Integrated Spatial-Temporal Dynamic Prompting for Video Understanding
Zichen Liu
Kunlun Xu
Fuchun Sun
Xu Zou
Yuxin Peng
Jiahuan Zhou
VLMAI4TS
193
2
0
20 Mar 2025
MiLA: Multi-view Intensive-fidelity Long-term Video Generation World Model for Autonomous Driving
MiLA: Multi-view Intensive-fidelity Long-term Video Generation World Model for Autonomous Driving
Haiguang Wang
Daqi Liu
Hongwei Xie
Haisong Liu
Enhui Ma
Kaicheng Yu
Limin Wang
Bing Wang
VGen
119
2
0
20 Mar 2025
DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers
DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers
Mert Bulent Sariyildiz
Philippe Weinzaepfel
Thomas Lucas
Pau de Jorge
Diane Larlus
Yannis Kalantidis
110
0
0
18 Mar 2025
Condensing Action Segmentation Datasets via Generative Network Inversion
Condensing Action Segmentation Datasets via Generative Network Inversion
Guodong Ding
Rongyu Chen
Angela Yao
DD
147
1
0
18 Mar 2025
A Real-Time Human Action Recognition Model for Assisted Living
A Real-Time Human Action Recognition Model for Assisted Living
Yixuan Wang
Paul Stynes
Pramod Pathak
Cristina Muntean
60
0
0
18 Mar 2025
GIFT: Generated Indoor video frames for Texture-less point tracking
GIFT: Generated Indoor video frames for Texture-less point tracking
Jianzheng Huang
Xianyu Mo
Ziling Liu
Jinyu Yang
Feng Zheng
DiffM3DPC3DVVGen
99
0
0
17 Mar 2025
Towards Scalable Modeling of Compressed Videos for Efficient Action Recognition
Towards Scalable Modeling of Compressed Videos for Efficient Action Recognition
Shristi Das Biswas
Efstathia Soufleri
Arani Roy
Kaushik Roy
116
0
0
17 Mar 2025
Cross-Modal Consistency Learning for Sign Language Recognition
Cross-Modal Consistency Learning for Sign Language Recognition
Kepeng Wu
Zecheng Li
Weichao Zhao
Hezhen Hu
Wengang Zhou
SLR
96
0
0
16 Mar 2025
ISLR101: an Iranian Word-Level Sign Language Recognition Dataset
ISLR101: an Iranian Word-Level Sign Language Recognition Dataset
Hossein Ranjbar
Alireza Taheri
SLR
88
0
0
16 Mar 2025
Multi Activity Sequence Alignment via Implicit Clustering
Multi Activity Sequence Alignment via Implicit Clustering
Taein Kwon
Zador Pataki
Mahdi Rad
Marc Pollefeys
HAIAI4TS
103
0
0
16 Mar 2025
Domain Generalization for Improved Human Activity Recognition in Office Space Videos Using Adaptive Pre-processing
Domain Generalization for Improved Human Activity Recognition in Office Space Videos Using Adaptive Pre-processing
Partho Ghosh
Raisa Bentay Hossain
Mohammad Zunaed
Taufiq Hasan
101
0
0
16 Mar 2025
VideoMAP: Toward Scalable Mamba-based Video Autoregressive Pretraining
VideoMAP: Toward Scalable Mamba-based Video Autoregressive Pretraining
Yunze Liu
Peiran Wu
C. Liang
Junxiao Shen
Limin Wang
Li Yi
Mamba
161
1
0
16 Mar 2025
EQ-TAA: Equivariant Traffic Accident Anticipation via Diffusion-Based Accident Video Synthesis
EQ-TAA: Equivariant Traffic Accident Anticipation via Diffusion-Based Accident Video Synthesis
Jianwu Fang
Lei-lei Li
Zhedong Zheng
Hongkai Yu
Jianru Xue
Zhengguo Li
Tat-Seng Chua
21
0
0
16 Mar 2025
Salient Temporal Encoding for Dynamic Scene Graph Generation
Salient Temporal Encoding for Dynamic Scene Graph Generation
Zhihao Zhu
91
0
0
15 Mar 2025
A Large-Scale Study on Video Action Dataset Condensation
A Large-Scale Study on Video Action Dataset Condensation
Yang Chen
Sheng Guo
Bo Zheng
Limin Wang
DD
169
3
0
13 Mar 2025
R^RRFLAV: Rolling Flow matching for infinite Audio Video generation
Alex Ergasti
Giuseppe Tarollo
Filippo Botti
Tomaso Fontanini
Claudio Ferrari
Massimo Bertozzi
Andrea Prati
VGen
80
0
0
13 Mar 2025
PromptGAR: Flexible Promptive Group Activity Recognition
Zhangyu Jin
Andrew Feng
Ankur Chemburkar
Celso M. De Melo
VLM
101
0
0
11 Mar 2025
SignRep: Enhancing Self-Supervised Sign Representations
Ryan Wong
Necati Cihan Camgöz
Richard Bowden
SLR
167
1
0
11 Mar 2025
STEAD: Spatio-Temporal Efficient Anomaly Detection for Time and Compute Sensitive Applications
Andrew Gao
Jun Liu
AI4TS
96
0
0
11 Mar 2025
Analysis of 3D Urticaceae Pollen Classification Using Deep Learning Models
Tijs Konijn
Imaan Bijl
Lu Cao
Fons Verbeek
109
0
0
10 Mar 2025
COMODO: Cross-Modal Video-to-IMU Distillation for Efficient Egocentric Human Activity Recognition
Baiyu Chen
Wilson Wongso
Zechen Li
Yonchanok Khaokaew
Hao Xue
Flora D. Salim
168
1
0
10 Mar 2025
Sign Language Translation using Frame and Event Stream: Benchmark Dataset and Algorithms
Xinyu Wang
Yuchen Li
Fuling Wang
Bo Jiang
Yansen Wang
Yonghong Tian
Jin Tang
Bin Luo
SLR
104
1
0
09 Mar 2025
Online Dense Point Tracking with Streaming Memory
Qiaole Dong
Yanwei Fu
76
0
0
09 Mar 2025
SGA-INTERACT: A 3D Skeleton-based Benchmark for Group Activity Understanding in Modern Basketball Tactic
Yue Yang
Wei Wang
Yifei Liu
Linfeng Dong
Hao Wu
Mingxin Zhang
Zhihang Zhong
Xiao-Fu Sun
82
1
0
09 Mar 2025
Object-Centric World Model for Language-Guided Manipulation
Youngjoon Jeong
Junha Chun
S. Cha
Taesup Kim
OCLVGen
402
2
0
08 Mar 2025
Get In Video: Add Anything You Want to the Video
Shaobin Zhuang
Zhipeng Huang
Binxin Yang
Ying Zhang
Fangyikang Wang
Canmiao Fu
Chong Sun
Zheng-Jun Zha
Chen Li
Yijiao Wang
DiffMVGen
107
3
0
08 Mar 2025
End-to-End Action Segmentation Transformer
Tieqiao Wang
Sinisa Todorovic
ViT
89
0
0
08 Mar 2025
Gate-Shift-Pose: Enhancing Action Recognition in Sports with Skeleton Information
Edoardo Bianchi
Oswald Lanz
3DH
106
2
0
06 Mar 2025
Previous
123456...717273
Next