Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

22 May 2017
João Carreira
Andrew Zisserman

Papers citing "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"

50 / 1,396 papers shown
MD-BERT: Action Recognition in Dark Videos via Dynamic Multi-Stream Fusion and Temporal Modeling
Sharana Dharshikgan Suresh Dass
H. Barua
Ganesh Krishnasamy
Raveendran Paramesran
Raphael C.-W. Phan
69
0
0
06 Feb 2025
AI-Based Thermal Video Analysis in Privacy-Preserving Healthcare: A Case Study on Detecting Time of Birth
Jorge García-Torres
Øyvind Meinich-Bache
Siren Rettedal
K. Engan
45
1
0
05 Feb 2025
BRIDLE: Generalized Self-supervised Learning with Quantization
Hoang M. Nguyen
Satya Narayan Shukla
Qiang Zhang
Hanchao Yu
Sreya D. Roy
Taipeng Tian
Lingjiong Zhu
Yuchen Liu
SSL
MQ
84
0
0
04 Feb 2025
BILLNET: A Binarized Conv3D-LSTM Network with Logic-gated residual architecture for hardware-efficient video inference
Van Thien Nguyen
William Guicquero
Gilles Sicard
3DV
MQ
82
2
0
24 Jan 2025
Can masking background and object reduce static bias for zero-shot action recognition?
Takumi Fukuzawa
Kensho Hara
Hirokatsu Kataoka
Toru Tamaki
43
0
0
22 Jan 2025
Efficient Lung Ultrasound Severity Scoring Using Dedicated Feature Extractor
Jiaqi Guo
Yunnan Wu
E. Kaimakamis
Georgios Petmezas
Vasileios E. Papageorgiou
N. Maglaveras
Aggelos K. Katsaggelos
72
0
0
21 Jan 2025
CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation
Zheng Chong
Wenqing Zhang
Shiyue Zhang
Jun Zheng
Xiao Dong
Haoxiang Li
Yiling Wu
D. Jiang
Xiaodan Liang
DiffM
37
1
0
20 Jan 2025
Dynamic Scene Understanding from Vision-Language Representations
Shahaf Pruss
Morris Alper
Hadar Averbuch-Elor
OCL
227
0
0
20 Jan 2025
A Multimodal Dataset for Enhancing Industrial Task Monitoring and Engagement Prediction
Naval Kishore Mehta
Arvind
Himanshu Kumar
Abeer Banerjee
Sumeet Saurav
Sanjay Singh
44
0
0
10 Jan 2025
MS-Temba : Multi-Scale Temporal Mamba for Efficient Temporal Action Detection
Arkaprava Sinha
Monish Soundar Raj
Pu Wang
Ahmed Helmy
Srijan Das
Mamba
58
3
0
10 Jan 2025
Evolving Skeletons: Motion Dynamics in Action Recognition
Jushang Qiu
Lei Wang
52
0
0
05 Jan 2025
SSL Framework for Causal Inconsistency between Structures and Representations
Hang Chen
Xinyu Yang
Keqing Du
Wenya Wang
56
2
0
03 Jan 2025
Multiscaled Multi-Head Attention-based Video Transformer Network for Hand Gesture Recognition
Mallika Garg
Debashis Ghosh
P. M. Pradhan
SLR
41
16
0
03 Jan 2025
A Simple Recipe for Contrastively Pre-training Video-First Encoders Beyond 16 Frames
Pinelopi Papalampidi
Skanda Koppula
Shreya Pathak
Justin T Chiu
Joseph Heyward
Viorica Patraucean
Jiajun Shen
Antoine Miech
Andrew Zisserman
Aida Nematzadeh
VLM
69
24
0
31 Dec 2024
Action-Agnostic Point-Level Supervision for Temporal Action Detection
Shuhei M. Yoshida
Takashi Shibata
M. Terao
Takayuki Okatani
Masashi Sugiyama
39
0
0
31 Dec 2024
Do Language Models Understand Time?
Xi Ding
Lei Wang
184
0
0
18 Dec 2024
Training Strategies for Isolated Sign Language Recognition
Karina Kvanchiani
Roman Kraynov
Elizaveta Petrova
Petr Surovcev
Aleksandr Nagaev
A. Kapitanov
84
1
0
16 Dec 2024
Progress-Aware Video Frame Captioning
Zihui Xue
Joungbin An
Xitong Yang
Kristen Grauman
102
1
0
03 Dec 2024
TAMT: Temporal-Aware Model Tuning for Cross-Domain Few-Shot Action Recognition
Yilong Wang
Zilin Gao
Qilong Wang
Zhaofeng Chen
P. Li
Q. Hu
84
1
0
28 Nov 2024
Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric
Zhichao Zhang
Wei Sun
Xinyue Li
Yunhao Li
Qihang Ge
...
Zhongpeng Ji
Fengyu Sun
Shangling Jui
Xiongkuo Min
Guangtao Zhai
EGVM
122
1
0
25 Nov 2024
Principles of Visual Tokens for Efficient Video Understanding
Xinyue Hao
Gen Li
Shreyank N. Gowda
Robert B Fisher
Jonathan Huang
Anurag Arnab
Laura Sevilla-Lara
98
0
0
20 Nov 2024
Efficient Transfer Learning for Video-language Foundation Models
Haoxing Chen
Zizheng Huang
Y. Hong
Yanshuo Wang
Zhongcai Lyu
Zhuoer Xu
Jun Lan
Zhangxuan Gu
VLM
54
0
0
18 Nov 2024
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level
Andong Deng
Tongjia Chen
Shoubin Yu
Taojiannan Yang
Lincoln Spencer
Yapeng Tian
Ajmal Mian
Joey Tianyi Zhou
Chen Chen
LRM
68
1
0
15 Nov 2024
DELTA: Dense Efficient Long-range 3D Tracking for any video
Tuan Duc Ngo
Peiye Zhuang
Chuang Gan
E. Kalogerakis
Sergey Tulyakov
Hsin-Ying Lee
Chaoyang Wang
57
5
0
31 Oct 2024
Investigating Memorization in Video Diffusion Models
Chong Chen
Enhuai Liu
Daochang Liu
M. Shah
Chang Xu
VGen
DiffM
86
1
0
29 Oct 2024
Frontiers in Intelligent Colonoscopy
Ge-Peng Ji
Jingyi Liu
Peng Xu
Nick Barnes
Fahad Shahbaz Khan
Salman Khan
Deng-Ping Fan
49
4
0
22 Oct 2024
Understanding Spatio-Temporal Relations in Human-Object Interaction using Pyramid Graph Convolutional Network
Hao Xing
Darius Burschka
42
11
0
10 Oct 2024
Dynamic Sparse Training versus Dense Training: The Unexpected Winner in Image Corruption Robustness
Boqian Wu
Q. Xiao
Shunxin Wang
N. Strisciuglio
Mykola Pechenizkiy
M. V. Keulen
Decebal Constantin Mocanu
Elena Mocanu
OOD
3DH
57
0
0
03 Oct 2024
Learnable Expansion of Graph Operators for Multi-Modal Feature Fusion
Dexuan Ding
Lei Wang
Liyun Zhu
Tom Gedeon
Piotr Koniusz
44
4
0
02 Oct 2024
Ctrl-GenAug: Controllable Generative Augmentation for Medical Sequence Classification
Xinrui Zhou
Yuhao Huang
Haoran Dou
Shijing Chen
Ao Chang
...
Jie Jessie Ren
Ruobing Huang
Jun Cheng
Wufeng Xue
Dong Ni
MedIm
192
0
0
25 Sep 2024
Neuromorphic Facial Analysis with Cross-Modal Supervision
Federico Becattini
Luca Cultrera
Lorenzo Berlincioni
Claudio Ferrari
Andrea Leonardo
A. Bimbo
CVBM
3DH
59
0
0
16 Sep 2024
Uncertainty-Guided Appearance-Motion Association Network for Out-of-Distribution Action Detection
Xiang Fang
Arvind Easwaran
B. Genest
36
4
0
16 Sep 2024
Locality-aware Cross-modal Correspondence Learning for Dense Audio-Visual Events Localization
Ling Xing
Hongyu Qu
Rui Yan
Xiangbo Shu
Jinhui Tang
45
1
0
12 Sep 2024
Deep Learning for Video Anomaly Detection: A Review
Peng Wu
Chengyu Pan
Yuting Yan
Guansong Pang
Peng Wang
Yanning Zhang
VLM
AI4TS
48
6
0
09 Sep 2024
Introducing Gating and Context into Temporal Action Detection
Aglind Reka
Diana Laura Borza
Dominick Reilly
Michal Balazia
Francois Bremond
30
0
0
06 Sep 2024
Self-Supervised Contrastive Learning for Videos using Differentiable Local Alignment
Keyne Oei
Amr Gomaa
Anna Maria Feit
João Belo
33
0
0
06 Sep 2024
Towards Student Actions in Classroom Scenes: New Dataset and Baseline
Zhuolin Tan
Chenqiang Gao
Anyong Qin
Ruixin Chen
Tiecheng Song
Feng Yang
Deyu Meng
31
0
0
02 Sep 2024
GMFL-Net: A Global Multi-geometric Feature Learning Network for Repetitive Action Counting
Jun Li
Jinying Wu
Qiming Li
Feifei Guo
47
0
0
31 Aug 2024
Automated Vehicle Driver Monitoring Dataset from Real-World Scenarios
Mohamed Sabry
Walter Morales-Alvarez
Cristina Olaverri-Monreal
37
0
0
19 Aug 2024
Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts
Peng Wu
Xuerong Zhou
Guansong Pang
Zhiwei Yang
Qingsen Yan
Peng Wang
Yanning Zhang
35
9
0
12 Aug 2024
EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition
Ahmed Abdelkawy
Asem A. Ali
Asem Ali
3DPC
31
0
0
10 Aug 2024
Lighthouse: A User-Friendly Library for Reproducible Video Moment Retrieval and Highlight Detection
Taichi Nishimura
Shota Nakada
Hokuto Munakata
Tatsuya Komatsu
VLM
34
1
0
06 Aug 2024
RICA2: Rubric-Informed, Calibrated Assessment of Actions
Abrar Majeedi
Viswanatha Reddy Gajjala
Satya Sai Srinath Namburi Gnvv
Yin Li
CML
31
2
0
04 Aug 2024
Faster Diffusion Action Segmentation
Shuai Wang
Shunli Wang
Mingcheng Li
Dingkang Yang
Haopeng Kuang
Ziyun Qian
Lihua Zhang
42
0
0
04 Aug 2024
MambaST: A Plug-and-Play Cross-Spectral Spatial-Temporal Fuser for Efficient Pedestrian Detection
Xiangbo Gao
A. Kanu-Asiegbu
Xiaoxiao Du
Mamba
38
0
0
02 Aug 2024
Con4m: Context-aware Consistency Learning Framework for Segmented Time Series Classification
Junru Chen
Tianyu Cao
Ninon De Mecquenem
Jiahe Li
Zhilong Chen
F. Friederici
Yang Yang
46
1
0
31 Jul 2024
Semi-Supervised Teacher-Reference-Student Architecture for Action Quality Assessment
Wu Yun
Mengshi Qi
Fei Peng
Huadong Ma
46
1
0
29 Jul 2024
Start from Video-Music Retrieval: An Inter-Intra Modal Loss for Cross Modal Retrieval
Zeyu Chen
Pengfei Zhang
Kai Ye
Wei Dong
Xin Feng
Yana Zhang
43
0
0
28 Jul 2024
Ego-VPA: Egocentric Video Understanding with Parameter-efficient Adaptation
Tz-Ying Wu
Kyle Min
Subarna Tripathi
Nuno Vasconcelos
EgoV
55
0
0
28 Jul 2024
A Comprehensive Review of Few-shot Action Recognition
Yuyang Wanyan
Xiaoshan Yang
Weiming Dong
Changsheng Xu
VLM
80
3
0
20 Jul 2024