ResearchTrend.AI
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

22 May 2017
João Carreira
Andrew Zisserman

Papers citing "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"

50 / 3,645 papers shown
Towards Unsupervised Model Selection for Domain Adaptive Object Detection
Hengfu Yu
Jinhong Deng
Wen Li
Lixin Duan
121
0
0
23 Dec 2024
Query-centric Audio-Visual Cognition Network for Moment Retrieval, Segmentation and Step-Captioning
Yunbin Tu
Liang-Sheng Li
Li Su
Qingming Huang
114
0
0
18 Dec 2024
Do Language Models Understand Time?
Xi Ding
Lei Wang
332
2
0
18 Dec 2024
2by2: Weakly-Supervised Learning for Global Action Segmentation
Elena Bueno-Benito
Mariella Dimiccoli
107
0
0
17 Dec 2024
Training Strategies for Isolated Sign Language Recognition
Karina Kvanchiani
Roman Kraynov
Elizaveta Petrova
Petr Surovcev
Aleksandr Nagaev
A. Kapitanov
163
1
0
16 Dec 2024
Uni-AdaFocus: Spatial-temporal Dynamic Computation for Video Recognition
Yulin Wang
Haoji Zhang
Yang Yue
Shiji Song
Chao Deng
Junlan Feng
Gao Huang
123
4
0
15 Dec 2024
Detecting Activities of Daily Living in Egocentric Video to Contextualize Hand Use at Home in Outpatient Neurorehabilitation Settings
Adesh Kadambi
José Zariffa
EgoV
102
2
0
14 Dec 2024
Dynamic Try-On: Taming Video Virtual Try-on with Dynamic Attention Mechanism
Jun Zheng
Jing Wang
Fuwei Zhao
Xujie Zhang
Xiaodan Liang
DiffM
VGen
123
0
0
13 Dec 2024
Temporal Action Localization with Cross Layer Task Decoupling and Refinement
Qiang Li
Di Liu
Jun Kong
Sen Li
Hui Xu
Jianzhong Wang
126
0
0
12 Dec 2024
Annotation Techniques for Judo Combat Phase Classification from Tournament Footage
Anthony Miyaguchi
Jed Moutahir
Tanmay Sutar
109
0
0
10 Dec 2024
Streaming Detection of Queried Event Start
Cristobal Eyzaguirre
Eric Tang
S. Buch
Adrien Gaidon
Jiajun Wu
Juan Carlos Niebles
116
0
0
04 Dec 2024
SyncFlow: Toward Temporally Aligned Joint Audio-Video Generation from Text
Haohe Liu
Gaël Le Lan
Xinhao Mei
Zhaoheng Ni
Anurag Kumar
Varun K. Nagaraja
Wenwu Wang
Mark D. Plumbley
Yangyang Shi
Vikas Chandra
VGen
157
1
0
03 Dec 2024
Progress-Aware Video Frame Captioning
Zihui Xue
Joungbin An
Xitong Yang
Kristen Grauman
234
1
0
03 Dec 2024
VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval
Dhiman Paul
Md Rizwan Parvez
Nabeel Mohammed
Shafin Rahman
VGen
125
0
0
02 Dec 2024
HaGRIDv2: 1M Images for Static and Dynamic Hand Gesture Recognition
Anton Nuzhdin
Alexander Nagaev
Alexander Sautin
A. Kapitanov
Karina Kvanchiani
EgoV
113
0
0
02 Dec 2024
EdgeOAR: Real-time Online Action Recognition On Edge Devices
Wei Luo
Deyu Zhang
Ying Tang
Fan Wu
Yaoxue Zhang
109
0
0
02 Dec 2024
Learner Attentiveness and Engagement Analysis in Online Education Using Computer Vision
Sharva Gogawale
Madhura Deshpande
Parteek Kumar
Irad Ben-Gal
77
0
0
30 Nov 2024
Hybrid Spiking Neural Network -- Transformer Video Classification Model
Aaron Bateni
102
0
0
29 Nov 2024
Learning Visual Abstract Reasoning through Dual-Stream Networks
Kai Zhao
Chang Xu
Bailu Si
172
4
0
29 Nov 2024
TAMT: Temporal-Aware Model Tuning for Cross-Domain Few-Shot Action Recognition
Yilong Wang
Zilin Gao
Qilong Wang
Zhaofeng Chen
P. Li
Q. Hu
182
1
0
28 Nov 2024
TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video
Jinyuan Qu
Hongyang Li
Shilong Liu
Tianhe Ren
Zhaoyang Zeng
Lei Zhang
3DPC
138
1
0
27 Nov 2024
Online Episodic Memory Visual Query Localization with Egocentric Streaming Object Memory
Zaira Manigrasso
Matteo Dunnhofer
Antonino Furnari
Moritz Nottebaum
Antonio Finocchiaro
Davide Marana
G. Farinella
C. Micheloni
114
1
0
25 Nov 2024
Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric
Zhichao Zhang
Wei Sun
Xinyue Li
Yunhao Li
Qihang Ge
...
Zhongpeng Ji
Fengyu Sun
Shangling Jui
Xiongkuo Min
Guangtao Zhai
EGVM
250
1
0
25 Nov 2024
A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning
Luis Vilaca
Yi Yu
Paula Viana
186
0
0
24 Nov 2024
OccludeNet: A Causal Journey into Mixed-View Actor-Centric Video Action Recognition under Occlusions
Guanyu Zhou
Xiaohan Yu
Wenxin Huang
Xuemei Jia
Xian Zhong
Chia-Wen Lin
CML
125
0
0
24 Nov 2024
ACE: Action Concept Enhancement of Video-Language Models in Procedural Videos
Reza Ghoddoosian
Nakul Agarwal
Isht Dwivedi
Behzad Dariush
99
0
0
23 Nov 2024
When Spatial meets Temporal in Action Recognition
H. Chen
Lei Wang
Yuxiao Chen
Tom Gedeon
Piotr Koniusz
166
3
0
22 Nov 2024
Privacy-Preserving Video Anomaly Detection: A Survey
Jing Liu
Yang Liu
Xiaoguang Zhu
Jielin Li
Hao Yang
Liangyu Teng
Juncen Guo
Yan Wang
Dingkang Yang
Jing-nan Liu
193
2
0
21 Nov 2024
Principles of Visual Tokens for Efficient Video Understanding
Xinyue Hao
Gen Li
Shreyank N. Gowda
Robert B Fisher
Jonathan Huang
Anurag Arnab
Laura Sevilla-Lara
191
0
0
20 Nov 2024
Video-to-Task Learning via Motion-Guided Attention for Few-Shot Action Recognition
Hanyu Guo
Wanchuan Yu
Suzhou Que
Kaiwen Du
Yan Yan
Hanzi Wang
189
1
0
18 Nov 2024
Efficient Transfer Learning for Video-language Foundation Models
Haoxing Chen
Zizheng Huang
Y. Hong
Yanshuo Wang
Zhongcai Lyu
Zhuoer Xu
Jun Lan
Zhangxuan Gu
VLM
105
0
0
18 Nov 2024
Anomaly Detection for People with Visual Impairments Using an Egocentric 360-Degree Camera
Inpyo Song
Sanghyeon Lee
Minjun Joo
Jangwon Lee
86
1
0
17 Nov 2024
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level
Andong Deng
Tongjia Chen
Shoubin Yu
Taojiannan Yang
Lincoln Spencer
Yapeng Tian
Ajmal Mian
Joey Tianyi Zhou
Chen Chen
LRM
109
3
0
15 Nov 2024
MFTIQ: Multi-Flow Tracker with Independent Matching Quality Estimation
Jonas Serych
Michal Neoral
Jiří Matas
116
3
0
14 Nov 2024
Weakly-Supervised Anomaly Detection in Surveillance Videos Based on Two-Stream I3D Convolution Network
Sareh Nejad
Anwar Haque
77
1
0
13 Nov 2024
Public Health Advocacy Dataset: A Dataset of Tobacco Usage Videos from Social Media
N. V. R. Chappa
Charlotte McCormick
Susana Rodriguez Gongora
P. Dobbs
Khoa Luu
137
2
0
12 Nov 2024
Improved Video VAE for Latent Video Diffusion Model
Pingyu Wu
Kai Zhu
Yu Liu
Liming Zhao
Wei-dong Zhai
Yang Cao
Zheng-jun Zha
VGen
DiffM
86
5
0
10 Nov 2024
Pseudo-labeling with Keyword Refining for Few-Supervised Video Captioning
Ping Li
Tao Wang
Xinkui Zhao
Xianghua Xu
Mingli Song
71
4
0
06 Nov 2024
Learning to Unify Audio, Visual and Text for Audio-Enhanced Multilingual Visual Answer Localization
Zhibin Wen
Bin Li
79
1
0
05 Nov 2024
Conditional Vendi Score: An Information-Theoretic Approach to Diversity Evaluation of Prompt-based Generative Models
Mohammad Jalali
Azim Ospanov
Amin Gohari
Farzan Farnia
EGVM
96
4
0
05 Nov 2024
PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance
Ruyang Liu
Haoran Tang
Haibo Liu
Yixiao Ge
Ying Shan
Chen Li
Jiankun Yang
VLM
70
7
0
04 Nov 2024
AM Flow: Adapters for Temporal Processing in Action Recognition
Tanay Agrawal
Abid Ali
A. Dantcheva
François Brémond
73
0
0
04 Nov 2024
ROAD-Waymo: Action Awareness at Scale for Autonomous Driving
Salman Khan
Izzeddin Teeti
Reza Javanmard Alitappeh
Mihaela C. Stoian
Eleonora Giunchiglia
Gurkirt Singh
Andrew Bradley
Fabio Cuzzolin
95
0
0
03 Nov 2024
OnlineTAS: An Online Baseline for Temporal Action Segmentation
Qing Zhong
Guodong Ding
Angela Yao
107
3
0
02 Nov 2024
STAA: Spatio-Temporal Attention Attribution for Real-Time Interpreting Transformer-based Video Models
Zerui Wang
Yan Liu
111
1
0
01 Nov 2024
Enhancing Motion in Text-to-Video Generation with Decomposed Encoding and Conditioning
Penghui Ruan
Pichao Wang
Divya Saxena
Jiannong Cao
Yuhui Shi
DiffM
VGen
104
0
0
31 Oct 2024
MV-CC: Mask Enhanced Video Model for Remote Sensing Change Caption
Ruixun Liu
Kaiyu Li
Jiayi Song
Dongwei Sun
Xiangyong Cao
VGen
87
1
0
31 Oct 2024
Recovering Complete Actions for Cross-dataset Skeleton Action Recognition
Hanchao Liu
Yujiang Li
Tai-Jiang Mu
Shi-Min Hu
92
0
0
31 Oct 2024
DELTA: Dense Efficient Long-range 3D Tracking for any video
Tuan Duc Ngo
Peiye Zhuang
Chuang Gan
E. Kalogerakis
Sergey Tulyakov
Hsin-Ying Lee
Chaoyang Wang
199
8
0
31 Oct 2024
Spatio-temporal Transformers for Action Unit Classification with Event Cameras
Luca Cultrera
Federico Becattini
Lorenzo Berlincioni
Claudio Ferrari
A. Bimbo
78
1
0
29 Oct 2024