1705.07750
Cited By
Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset
22 May 2017
João Carreira
Andrew Zisserman
Papers citing
"Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset"
50 / 3,645 papers shown
Title
Deepfake Caricatures: Amplifying attention to artifacts increases deepfake detection by humans and machines
Camilo Luciano Fosco
Emilie Josephs
A. Andonian
Allen Lee
93
4
0
01 Jul 2025
D²ST-Adapter: Disentangled-and-Deformable Spatio-Temporal Adapter for Few-shot Action Recognition
Wenjie Pei
Qizhong Tan
Guangming Lu
Jiandong Tian
Jun Yu
135
3
0
01 Jul 2025
Language-driven Description Generation and Common Sense Reasoning for Video Action Recognition
Xiaodan Hu
Chuhang Zou
Suchen Wang
Jaechul Kim
Narendra Ahuja
LRM
15
0
0
20 Jun 2025
EVA02-AT: Egocentric Video-Language Understanding with Spatial-Temporal Rotary Positional Embeddings and Symmetric Optimization
Xiaoqi Wang
Yi Wang
Lap-Pui Chau
30
0
0
17 Jun 2025
Learning Event Completeness for Weakly Supervised Video Anomaly Detection
Yu Wang
Shiwei Chen
29
0
0
16 Jun 2025
DeSPITE: Exploring Contrastive Deep Skeleton-Pointcloud-IMU-Text Embeddings for Advanced Point Cloud Human Activity Understanding
Thomas Kreutz
M. Mühlhäuser
Alejandro Sánchez Guinea
34
0
0
16 Jun 2025
DejaVid: Encoder-Agnostic Learned Temporal Matching for Video Classification
Darryl Ho
Samuel Madden
AI4TS
17
0
0
14 Jun 2025
Real-Time Per-Garment Virtual Try-On with Temporal Consistency for Loose-Fitting Garments
Zaiqiang Wu
I-Chao Shen
Takeo Igarashi
26
0
0
14 Jun 2025
Low-Barrier Dataset Collection with Real Human Body for Interactive Per-Garment Virtual Try-On
Zaiqiang Wu
Yechen Li
Jingyuan Liu
Yuki Shibata
Takayuki Hori
I-Chao Shen
Takeo Igarashi
3DH
138
1
0
12 Jun 2025
Ground Reaction Force Estimation via Time-aware Knowledge Distillation
Eun Som Jeon
Sinjini Mitra
Jisoo Lee
Omik M. Save
Ankita Shukla
Hyunglae Lee
Pavan Turaga
122
0
0
12 Jun 2025
An Effective End-to-End Solution for Multimodal Action Recognition
Songping Wang
Xiantao Hu
Yueming Lyu
Caifeng Shan
70
0
0
11 Jun 2025
Synthetic Human Action Video Data Generation with Pose Transfer
Vaclav Knapp
Matyas Bohacek
89
0
0
11 Jun 2025
MPFNet: A Multi-Prior Fusion Network with a Progressive Training Strategy for Micro-Expression Recognition
Chuang Ma
Shaokai Zhao
Dongdong Zhou
Yu Pei
Zhiguo Luo
Liang Xie
Ye Yan
Erwei Yin
72
0
0
11 Jun 2025
Data-Efficient Challenges in Visual Inductive Priors: A Retrospective
Robert-Jan Bruintjes
A. Lengyel
O. Kayhan
Davide Zambrano
Nergis Tomen
Hadi Jamali Rad
Jan van Gemert
VLM
31
0
0
10 Jun 2025
Enhancing Video Memorability Prediction with Text-Motion Cross-modal Contrastive Loss and Its Application in Video Summarization
Zhiyi Zhu
Xiaoyu Wu
Youwei Lu
33
0
0
10 Jun 2025
PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement
Teng Hu
Zhentao Yu
Zhengguang Zhou
Jiangning Zhang
Yuan Zhou
Qinglin Lu
Ran Yi
VGen
20
0
0
09 Jun 2025
Ambiguity-Restrained Text-Video Representation Learning for Partially Relevant Video Retrieval
CH Cho
WJ Moon
W Jun
MS Jung
JP Heo
17
0
0
09 Jun 2025
Super Encoding Network: Recursive Association of Multi-Modal Encoders for Video Understanding
Boyu Chen
Siran Chen
Kunchang Li
Qinglin Xu
Yu Qiao
Yali Wang
VOS
25
0
0
09 Jun 2025
PhysLab: A Benchmark Dataset for Multi-Granularity Visual Parsing of Physics Experiments
Minghao Zou
Qingtian Zeng
Yongping Miao
Shangkun Liu
Zilong Wang
Hantao Liu
Wei Zhou
22
0
0
07 Jun 2025
Bridging Perspectives: A Survey on Cross-view Collaborative Intelligence with Egocentric-Exocentric Vision
Yuping He
Yifei Huang
Guo Chen
Lidong Lu
Baoqi Pei
Jilan Xu
Tong Lu
Yoichi Sato
EgoV
86
0
0
06 Jun 2025
ChronoTailor: Harnessing Attention Guidance for Fine-Grained Video Virtual Try-On
Jinjuan Wang
Wenzhang Sun
Ming Li
Y. Zheng
Fanyao Li
Zhulin Tao
Donglin Di
Hao Li
Wei Chen
Xianglin Huang
VGen
AI4TS
60
0
0
06 Jun 2025
PATS: Proficiency-Aware Temporal Sampling for Multi-View Sports Skill Assessment
Edoardo Bianchi
Antonio Liotta
TTA
167
0
0
05 Jun 2025
Spike-TBR: a Noise Resilient Neuromorphic Event Representation
Gabriele Magrini
Federico Becattini
Luca Cultrera
Lorenzo Berlincioni
P. Pala
A. Bimbo
106
0
0
05 Jun 2025
MamFusion: Multi-Mamba with Temporal Fusion for Partially Relevant Video Retrieval
Xinru Ying
Jiaqi Mo
Jingyang Lin
Canghong Jin
Fangfang Wang
Lina Wei
71
0
0
04 Jun 2025
Fine-Tuning Video Transformers for Word-Level Bangla Sign Language: A Comparative Analysis for Classification Tasks
Jubayer Ahmed Bhuiyan Shawon
H. Mahmud
Kamrul Hasan
48
0
0
04 Jun 2025
HRTR: A Single-stage Transformer for Fine-grained Sub-second Action Segmentation in Stroke Rehabilitation
H. Helvaci
Justin Philip Huber
Jihye Bae
S. Cheung
64
0
0
03 Jun 2025
Go Beyond Earth: Understanding Human Actions and Scenes in Microgravity Environments
Di Wen
Lei Qi
Kunyu Peng
Kailun Yang
Fei Teng
...
Yufan Chen
R. Liu
Yitian Shi
M. Sarfraz
Rainer Stiefelhagen
64
0
0
03 Jun 2025
A Review on Coarse to Fine-Grained Animal Action Recognition
Ali Zia
Renuka Sharma
Abdelwahed Khamis
Xuesong Li
Muhammad Husnain
...
Saeed Anwar
Sabine Schmoelzl
Eric A. Stone
Lars Petersson
V. Rolland
40
0
0
01 Jun 2025
3D Skeleton-Based Action Recognition: A Review
Mengyuan Liu
Hong Liu
Qianshuo Hu
Bin Ren
Junsong Yuan
Jiaying Lin
Jiajun Wen
60
0
0
01 Jun 2025
DiG-Net: Enhancing Quality of Life through Hyper-Range Dynamic Gesture Recognition in Assistive Robotics
Eran Bamani Beeri
Eden Nissinman
A. Sintov
18
0
0
30 May 2025
Reading Recognition in the Wild
Charig Yang
Samiul Alam
Shakhrul Iman Siam
Michael J. Proulx
Lambert Mathias
...
Carl Ren
Mi Zhang
Yuning Chai
Richard Newcombe
Hyo Jin Kim
34
0
0
30 May 2025
MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation
Siyuan Wang
Jiawei Liu
Wei Wang
Yeying Jin
Jinsong Du
Zhi Han
SLR
VGen
61
0
0
29 May 2025
Toward Memory-Aided World Models: Benchmarking via Spatial Consistency
Kewei Lian
Shaofei Cai
Yilun Du
Yitao Liang
78
0
0
29 May 2025
PreFM: Online Audio-Visual Event Parsing via Predictive Future Modeling
X. Yu
Yan Fang
Xiaojie Jin
Yao Zhao
Yunchao Wei
47
0
0
29 May 2025
PRISM: Video Dataset Condensation with Progressive Refinement and Insertion for Sparse Motion
Jaehyun Choi
Jiwan Hur
Gyojin Han
Jaemyung Yu
Junmo Kim
VGen
21
0
0
28 May 2025
Dynamic-Aware Video Distillation: Optimizing Temporal Resolution Based on Video Semantics
Yinjie Zhao
Heng Zhao
Bihan Wen
Yew-Soon Ong
Joey Tianyi Zhou
VGen
18
0
0
28 May 2025
PHI: Bridging Domain Shift in Long-Term Action Quality Assessment via Progressive Hierarchical Instruction
Kanglei Zhou
Hubert P. H. Shum
Frederick W. B. Li
Xingxing Zhang
Xiaohui Liang
22
0
0
26 May 2025
Advancing Video Self-Supervised Learning via Image Foundation Models
Jingwei Wu
Zhewei Huang
Chang Liu
44
0
0
25 May 2025
ProTAL: A Drag-and-Link Video Programming Framework for Temporal Action Localization
Yuchen He
Jianbing Lv
Liqi Cheng
Lingyu Meng
Dazhen Deng
Yingcai Wu
43
0
0
23 May 2025
Multi-task Learning For Joint Action and Gesture Recognition
Konstantinos Spathis
N. Kardaris
Petros Maragos
35
0
0
23 May 2025
Temporal Consistency Constrained Transferable Adversarial Attacks with Background Mixup for Action Recognition
Ping Li
Jianan Ni
Bo Pang
AAML
250
0
0
23 May 2025
Pursuing Temporal-Consistent Video Virtual Try-On via Dynamic Pose Interaction
Dong Li
Wenqi Zhong
Wei Yu
Yingwei Pan
Dingwen Zhang
Ting Yao
Junwei Han
Tao Mei
DiffM
VGen
72
0
0
22 May 2025
CAD: A General Multimodal Framework for Video Deepfake Detection via Cross-Modal Alignment and Distillation
Yuxuan Du
Zhendong Wang
Yuhao Luo
Caiyong Piao
Zhiyuan Yan
Hao Li
Li Yuan
169
0
0
21 May 2025
RoPECraft: Training-Free Motion Transfer with Trajectory-Guided RoPE Optimization on Diffusion Transformers
Ahmet Berke Gokmen
Yigit Ekin
Bahri Batuhan Bilecen
Aysegül Dündar
155
0
0
19 May 2025
TeleOpBench: A Simulator-Centric Benchmark for Dual-Arm Dexterous Teleoperation
Hangyu Li
Qin Zhao
Haoran Xu
Xinyu Jiang
Qingwei Ben
...
Jia Zeng
Hanqing Wang
Bo Dai
Junting Dong
Jiangmiao Pang
100
1
0
19 May 2025
Multi-Modal Artificial Intelligence of Embryo Grading and Pregnancy Prediction in Assisted Reproductive Technology: A Review
Xueqiang Ouyang
Jia Wei
121
0
0
19 May 2025
Just Dance with π! A Poly-modal Inductor for Weakly-supervised Video Anomaly Detection
Snehashis Majhi
Giacomo D'Amicantonio
A. Dantcheva
Quan Kong
Lorenzo Garattoni
Gianpiero Francesca
Egor Bondarev
Francois Bremond
47
0
0
19 May 2025
Automated Real-time Assessment of Intracranial Hemorrhage Detection AI Using an Ensembled Monitoring Model (EMM)
Zhongnan Fang
Andrew Johnston
Lina Cheuy
Hye Sun Na
Magdalini Paschali
...
Derrick Laurel
Andrew Walker Campion
Michael Iv
Akshay S. Chaudhari
David B. Larson
99
0
0
16 May 2025
Are Spatial-Temporal Graph Convolution Networks for Human Action Recognition Over-Parameterized?
Jianyang Xie
Yitian Zhao
Y. Meng
He Zhao
Anh Nguyen
Yalin Zheng
65
0
0
15 May 2025
SkillFormer: Unified Multi-View Video Understanding for Proficiency Estimation
Edoardo Bianchi
Antonio Liotta
59
0
0
13 May 2025