ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1705.07750
  4. Cited By
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

22 May 2017
João Carreira
Andrew Zisserman
ArXivPDFHTML

Papers citing "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"

50 / 1,396 papers shown
Title
ActionSwitch: Class-agnostic Detection of Simultaneous Actions in
  Streaming Videos
ActionSwitch: Class-agnostic Detection of Simultaneous Actions in Streaming Videos
Hyolim Kang
Jeongseok Hyun
Joungbin An
Youngjae Yu
Seon Joo Kim
38
0
0
17 Jul 2024
Rethinking Image-to-Video Adaptation: An Object-centric Perspective
Rethinking Image-to-Video Adaptation: An Object-centric Perspective
Rui Qian
Shuangrui Ding
Dahua Lin
OCL
52
1
0
09 Jul 2024
Exploiting Heterogeneity in Timescales for Sparse Recurrent Spiking
  Neural Networks for Energy-Efficient Edge Computing
Exploiting Heterogeneity in Timescales for Sparse Recurrent Spiking Neural Networks for Energy-Efficient Edge Computing
Biswadeep Chakraborty
Saibal Mukhopadhyay
38
2
0
08 Jul 2024
MMAD: Multi-label Micro-Action Detection in Videos
MMAD: Multi-label Micro-Action Detection in Videos
Kun Li
Pengyu Liu
Pengyu Liu
Guoliang Chen
Zhiliang Wu
Hehe Fan
Meng Wang
47
7
0
07 Jul 2024
Tarsier: Recipes for Training and Evaluating Large Video Description
  Models
Tarsier: Recipes for Training and Evaluating Large Video Description Models
Jiawei Wang
Liping Yuan
Yuchen Zhang
47
52
0
30 Jun 2024
Open-Vocabulary Temporal Action Localization using Multimodal Guidance
Open-Vocabulary Temporal Action Localization using Multimodal Guidance
Akshita Gupta
Aditya Arora
Sanath Narayan
Salman Khan
Fahad Shahbaz Khan
Graham W. Taylor
41
3
0
21 Jun 2024
Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation
Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation
Jiaming Zhou
Teli Ma
Kun-Yu Lin
Ronghe Qiu
Zifan Wang
Junwei Liang
54
5
0
20 Jun 2024
Self-Supervised Representation Learning with Spatial-Temporal
  Consistency for Sign Language Recognition
Self-Supervised Representation Learning with Spatial-Temporal Consistency for Sign Language Recognition
Weichao Zhao
Wengang Zhou
Hezhen Hu
Min Wang
Houqiang Li
SLR
43
2
0
15 Jun 2024
Video-based Exercise Classification and Activated Muscle Group
  Prediction with Hybrid X3D-SlowFast Network
Video-based Exercise Classification and Activated Muscle Group Prediction with Hybrid X3D-SlowFast Network
Manvik Pasula
Pramit Saha
29
0
0
10 Jun 2024
An Effective-Efficient Approach for Dense Multi-Label Action Detection
An Effective-Efficient Approach for Dense Multi-Label Action Detection
Faegheh Sardari
Armin Mustafa
Philip J. B. Jackson
Adrian Hilton
37
0
0
10 Jun 2024
Distilling Aggregated Knowledge for Weakly-Supervised Video Anomaly Detection
Distilling Aggregated Knowledge for Weakly-Supervised Video Anomaly Detection
Jash Dalvi
Ali Dabouei
Gunjan Dhanuka
Min Xu
23
0
0
05 Jun 2024
EMAG: Ego-motion Aware and Generalizable 2D Hand Forecasting from
  Egocentric Videos
EMAG: Ego-motion Aware and Generalizable 2D Hand Forecasting from Egocentric Videos
Masashi Hatano
Ryo Hachiuma
Hideo Saito
EgoV
37
3
0
30 May 2024
Identity-free Artificial Emotional Intelligence via Micro-Gesture Understanding
Identity-free Artificial Emotional Intelligence via Micro-Gesture Understanding
Rong Gao
Xin Liu
Bohao Xing
Zitong Yu
Björn W. Schuller
Heikki Kälviäinen
57
3
0
21 May 2024
A Survey of Deep Learning-based Radiology Report Generation Using Multimodal Data
A Survey of Deep Learning-based Radiology Report Generation Using Multimodal Data
Xinyi Wang
Grazziela Figueredo
Ruizhe Li
W. Zhang
Weitong Chen
Xin Chen
MedIm
ViT
49
2
0
21 May 2024
ViViD: Video Virtual Try-on using Diffusion Models
ViViD: Video Virtual Try-on using Diffusion Models
Zixun Fang
Wei Zhai
Aimin Su
Hongliang Song
Kai Zhu
Mao Wang
Yu Chen
Zhiheng Liu
Yang Cao
Zheng-jun Zha
DiffM
VGen
52
7
0
20 May 2024
Networking Systems for Video Anomaly Detection: A Tutorial and Survey
Networking Systems for Video Anomaly Detection: A Tutorial and Survey
Jing Liu
Yang Liu
Jieyu Lin
Jielin Li
Peng Sun
Bo Hu
Liang Song
Azzedine Boukerche
Victor C.M. Leung
Victor C.M. Leung
43
10
0
16 May 2024
No Time to Waste: Squeeze Time into Channel for Mobile Video
  Understanding
No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding
Yingjie Zhai
Wenshuo Li
Yehui Tang
Xinghao Chen
Yunhe Wang
ViT
30
0
0
14 May 2024
Bidirectional Progressive Transformer for Interaction Intention
  Anticipation
Bidirectional Progressive Transformer for Interaction Intention Anticipation
Zichen Zhang
Hongcheng Luo
Wei Zhai
Yang Cao
Yu Kang
45
5
0
09 May 2024
Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba
Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba
Hongwei Ren
Yue Zhou
Jiadong Zhu
Haotian Fu
Yulong Huang
Xiaopeng Lin
Yuetong Fang
Fei Ma
Hao Yu
Bo-Xun Cheng
Mamba
43
9
0
09 May 2024
Exposing AI-generated Videos: A Benchmark Dataset and a Local-and-Global
  Temporal Defect Based Detection Method
Exposing AI-generated Videos: A Benchmark Dataset and a Local-and-Global Temporal Defect Based Detection Method
Peisong He
Leyao Zhu
Jiaxing Li
Shiqi Wang
Haoliang Li
EGVM
23
2
0
07 May 2024
Enhancing Micro Gesture Recognition for Emotion Understanding via
  Context-aware Visual-Text Contrastive Learning
Enhancing Micro Gesture Recognition for Emotion Understanding via Context-aware Visual-Text Contrastive Learning
Deng Li
Bohao Xing
Xin Liu
37
5
0
03 May 2024
MVP-Shot: Multi-Velocity Progressive-Alignment Framework for Few-Shot Action Recognition
MVP-Shot: Multi-Velocity Progressive-Alignment Framework for Few-Shot Action Recognition
Hongyu Qu
Rui Yan
Xiangbo Shu
Haoliang Gao
Peng Huang
Guo-Sen Xie
61
4
0
03 May 2024
EALD-MLLM: Emotion Analysis in Long-sequential and De-identity videos with Multi-modal Large Language Model
EALD-MLLM: Emotion Analysis in Long-sequential and De-identity videos with Multi-modal Large Language Model
Deng Li
Xin Liu
Bohao Xing
Baiqiang Xia
Yuan Zong
Bihan Wen
Heikki Kälviäinen
42
3
0
01 May 2024
One-Stage Open-Vocabulary Temporal Action Detection Leveraging Temporal
  Multi-scale and Action Label Features
One-Stage Open-Vocabulary Temporal Action Detection Leveraging Temporal Multi-scale and Action Label Features
Trung Thanh Nguyen
Yasutomo Kawanishi
Takahiro Komamizu
Ichiro Ide
VLM
33
3
0
30 Apr 2024
Learning Discriminative Spatio-temporal Representations for
  Semi-supervised Action Recognition
Learning Discriminative Spatio-temporal Representations for Semi-supervised Action Recognition
Yu Wang
Sanpin Zhou
Kun Xia
Le Wang
42
0
0
25 Apr 2024
IPAD: Industrial Process Anomaly Detection Dataset
IPAD: Industrial Process Anomaly Detection Dataset
Jinfan Liu
Yichao Yan
Junjie Li
Weiming Zhao
Pengzhi Chu
Xingdong Sheng
Yunhui Liu
Xiaokang Yang
43
0
0
23 Apr 2024
Video sentence grounding with temporally global textual knowledge
Video sentence grounding with temporally global textual knowledge
Cai Chen
Runzhong Zhang
Jianjun Gao
Kejun Wu
Kim-Hui Yap
Yi Wang
32
0
0
21 Apr 2024
Task-Driven Exploration: Decoupling and Inter-Task Feedback for Joint
  Moment Retrieval and Highlight Detection
Task-Driven Exploration: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection
Jin Yang
Ping Wei
Huan Li
Ziyang Ren
51
8
0
14 Apr 2024
An Animation-based Augmentation Approach for Action Recognition from
  Discontinuous Video
An Animation-based Augmentation Approach for Action Recognition from Discontinuous Video
Xingyu Song
Zhan Li
Shi Chen
Xin-Qiang Cai
K. Demachi
28
2
0
10 Apr 2024
Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large
  Multi-Modal Models
Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models
David Kurzendörfer
Otniel-Bogdan Mercea
A. Sophia Koepke
Zeynep Akata
VLM
CLIP
33
2
0
09 Apr 2024
Social-MAE: Social Masked Autoencoder for Multi-person Motion
  Representation Learning
Social-MAE: Social Masked Autoencoder for Multi-person Motion Representation Learning
Mahsa Ehsanpour
Ian Reid
Hamid Rezatofighi
ViT
40
0
0
08 Apr 2024
Spatio-Temporal Attention and Gaussian Processes for Personalized Video
  Gaze Estimation
Spatio-Temporal Attention and Gaussian Processes for Personalized Video Gaze Estimation
Swati Jindal
Mohit Yadav
Roberto Manduchi
37
5
0
08 Apr 2024
Koala: Key frame-conditioned long video-LLM
Koala: Key frame-conditioned long video-LLM
Reuben Tan
Ximeng Sun
Ping Hu
Jui-hsien Wang
Hanieh Deilamsalehy
Bryan A. Plummer
Bryan C. Russell
Kate Saenko
38
36
0
05 Apr 2024
A Closer Look at Spatial-Slice Features Learning for COVID-19 Detection
A Closer Look at Spatial-Slice Features Learning for COVID-19 Detection
Chih-Chung Hsu
Chia-Ming Lee
Chiang Fan Yang
Yi-Shiuan Chou
Chih-Yu Jiang
Shen-Chieh Tai
Chin-Han Tsai
44
0
0
02 Apr 2024
$R^2$-Tuning: Efficient Image-to-Video Transfer Learning for Video
  Temporal Grounding
R2R^2R2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding
Ye Liu
Jixuan He
Wanhua Li
Junsik Kim
D. Wei
Hanspeter Pfister
Chang Wen Chen
46
13
0
31 Mar 2024
Benchmarking the Robustness of Temporal Action Detection Models Against
  Temporal Corruptions
Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions
Runhao Zeng
Xiaoyong Chen
Jiaming Liang
Huisi Wu
Guangzhong Cao
Yong Guo
AAML
39
4
0
29 Mar 2024
Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition
Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition
Yash Jain
David M. Chan
Pranav Dheram
Aparna Khare
Olabanji Shonibare
Venkatesh Ravichandran
Shalini Ghosh
40
2
0
28 Mar 2024
PLOT-TAL -- Prompt Learning with Optimal Transport for Few-Shot Temporal
  Action Localization
PLOT-TAL -- Prompt Learning with Optimal Transport for Few-Shot Temporal Action Localization
Edward Fish
Jon Weinbren
Andrew Gilbert
49
1
0
27 Mar 2024
EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World
EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World
Yifei Huang
Guo Chen
Jilan Xu
Mingfang Zhang
Lijin Yang
...
Hongjie Zhang
Lu Dong
Yali Wang
Limin Wang
Yu Qiao
EgoV
68
38
0
24 Mar 2024
Unified Static and Dynamic Network: Efficient Temporal Filtering for Video Grounding
Unified Static and Dynamic Network: Efficient Temporal Filtering for Video Grounding
Jingjing Hu
Dan Guo
Kun Li
Zhan Si
Xun Yang
Xiaojun Chang
Meng Wang
61
3
0
21 Mar 2024
VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding
VURF: A General-purpose Reasoning and Self-refinement Framework for Video Understanding
Ahmad A Mahmood
Ashmal Vayani
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
LRM
56
7
0
21 Mar 2024
Advancing COVID-19 Detection in 3D CT Scans
Advancing COVID-19 Detection in 3D CT Scans
Qingqiu Li
Runtian Yuan
Junlin Hou
Jilan Xu
Yuejie Zhang
Rui Feng
Hao Chen
34
1
0
18 Mar 2024
On the Utility of 3D Hand Poses for Action Recognition
On the Utility of 3D Hand Poses for Action Recognition
Md Salman Shamil
Dibyadip Chatterjee
Fadime Sener
Shugao Ma
Angela Yao
40
5
0
14 Mar 2024
Intention-driven Ego-to-Exo Video Generation
Intention-driven Ego-to-Exo Video Generation
Hongcheng Luo
Kai Zhu
Wei Zhai
Yang Cao
DiffM
VGen
42
4
0
14 Mar 2024
Density-Guided Label Smoothing for Temporal Localization of Driving
  Actions
Density-Guided Label Smoothing for Temporal Localization of Driving Actions
Tunç Alkanat
Erkut Akdag
Egor Bondarev
Peter H. N. de With
38
4
0
11 Mar 2024
Advancing Generalizable Remote Physiological Measurement through the Integration of Explicit and Implicit Prior Knowledge
Advancing Generalizable Remote Physiological Measurement through the Integration of Explicit and Implicit Prior Knowledge
Yuting Zhang
Haobo Lu
Xin Liu
Ying-Cong Chen
Kaishun Wu
32
4
0
11 Mar 2024
Video Relationship Detection Using Mixture of Experts
Video Relationship Detection Using Mixture of Experts
A. Shaabana
Zahra Gharaee
Paul Fieguth
39
1
0
06 Mar 2024
Meet JEANIE: a Similarity Measure for 3D Skeleton Sequences via
  Temporal-Viewpoint Alignment
Meet JEANIE: a Similarity Measure for 3D Skeleton Sequences via Temporal-Viewpoint Alignment
Lei Wang
Jun Liu
Liang Zheng
Tom Gedeon
Piotr Koniusz
33
9
0
07 Feb 2024
Hypergraph-Transformer (HGT) for Interactive Event Prediction in Laparoscopic and Robotic Surgery
Hypergraph-Transformer (HGT) for Interactive Event Prediction in Laparoscopic and Robotic Surgery
Lianhao Yin
Yutong Ban
J. Eckhoff
O. Meireles
Daniela Rus
Guy Rosman
41
1
0
03 Feb 2024
Multimodal Action Quality Assessment
Multimodal Action Quality Assessment
Ling-an Zeng
Wei-Shi Zheng
43
13
0
31 Jan 2024
Previous
123456...262728
Next