ResearchTrend.AI
Two-Stream Convolutional Networks for Action Recognition in Videos

9 June 2014
Karen Simonyan
Andrew Zisserman

Papers citing "Two-Stream Convolutional Networks for Action Recognition in Videos"

50 / 2,289 papers shown
Title
D$^2$ST-Adapter: Disentangled-and-Deformable Spatio-Temporal Adapter for Few-shot Action Recognition
Wenjie Pei
Qizhong Tan
Guangming Lu
Jiandong Tian
Jun Yu
135
3
0
01 Jul 2025
Language-driven Description Generation and Common Sense Reasoning for Video Action Recognition
Xiaodan Hu
Chuhang Zou
Suchen Wang
Jaechul Kim
Narendra Ahuja
LRM
15
0
0
20 Jun 2025
An Effective End-to-End Solution for Multimodal Action Recognition
Songping Wang
Xiantao Hu
Yueming Lyu
Caifeng Shan
70
0
0
11 Jun 2025
Super Encoding Network: Recursive Association of Multi-Modal Encoders for Video Understanding
Boyu Chen
Siran Chen
Kunchang Li
Qinglin Xu
Yu Qiao
Yali Wang
VOS
25
0
0
09 Jun 2025
AugmentGest: Can Random Data Cropping Augmentation Boost Gesture Recognition Performance?
Nada Aboudeshish
D. Ignatov
Radu Timofte
51
3
0
08 Jun 2025
Bridging Perspectives: A Survey on Cross-view Collaborative Intelligence with Egocentric-Exocentric Vision
Yuping He
Yifei Huang
Guo Chen
Lidong Lu
Baoqi Pei
Jilan Xu
Tong Lu
Yoichi Sato
EgoV
84
0
0
06 Jun 2025
Efficient Egocentric Action Recognition with Multimodal Data
Marco Calzavara
Ard Kastrati
Matteo Macchini
Dushan Vasilevski
Roger Wattenhofer
EgoV
60
0
0
02 Jun 2025
3D Skeleton-Based Action Recognition: A Review
Mengyuan Liu
Hong Liu
Qianshuo Hu
Bin Ren
Junsong Yuan
Jiaying Lin
Jiajun Wen
60
0
0
01 Jun 2025
Scene Detection Policies and Keyframe Extraction Strategies for Large-Scale Video Analysis
Vasilii Korolkov
23
0
0
31 May 2025
Spatiotemporal Analysis of Forest Machine Operations Using 3D Video Classification
Maciej Wielgosz
Simon Berg
Heikki Korpunen
Stephan Hoffmann
17
0
0
30 May 2025
VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning
Liyun Zhu
Qixiang Chen
Xi Shen
Xiaodong Cun
AI4TS, LRM
78
0
0
29 May 2025
Vid-SME: Membership Inference Attacks against Large Video Understanding Models
Qi Li
Runpeng Yu
Xinchao Wang
22
2
0
29 May 2025
CA3D: Convolutional-Attentional 3D Nets for Efficient Video Activity Recognition on the Edge
Gabriele Lagani
Fabrizio Falchi
Claudio Gennaro
Giuseppe Amato
22
0
0
26 May 2025
Multi-task Learning For Joint Action and Gesture Recognition
Konstantinos Spathis
N. Kardaris
Petros Maragos
35
0
0
23 May 2025
Are Spatial-Temporal Graph Convolution Networks for Human Action Recognition Over-Parameterized?
Jianyang Xie
Yitian Zhao
Y. Meng
He Zhao
Anh Nguyen
Yalin Zheng
65
0
0
15 May 2025
Read My Ears! Horse Ear Movement Detection for Equine Affective State Assessment
João Alves
Pia Haubro Andersen
Rikke Gade
65
0
0
06 May 2025
ZS-VCOS: Zero-Shot Outperforms Supervised Video Camouflaged Object Segmentation
Wenqi Guo
Shan Du
VLM
97
0
0
10 Apr 2025
Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models
Yoojin Jung
Byung Cheol Song
AAML, VLM, MQ
91
0
0
07 Apr 2025
Is Temporal Prompting All We Need For Limited Labeled Action Recognition?
Shreyank N. Gowda
Boyan Gao
Xiao Gu
Xiaobo Jin
VLM
91
0
0
02 Apr 2025
Sample-level Adaptive Knowledge Distillation for Action Recognition
Ping Li
Chenhao Ping
Wenxiao Wang
Mingli Song
138
0
0
01 Apr 2025
A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives
Shuyu Li
Shulei Ji
Zihao Wang
Songruoyao Wu
Jiaxing Yu
Kai Zhang
MGen, VGen
295
1
0
01 Apr 2025
Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs
Lucas Ventura
Antoine Yang
Cordelia Schmid
Gül Varol
96
0
0
31 Mar 2025
CA^2ST: Cross-Attention in Audio, Space, and Time for Holistic Video Recognition
Jongseo Lee
Joohyun Chang
Dongho Lee
Jinwoo Choi
251
0
0
30 Mar 2025
CMD-HAR: Cross-Modal Disentanglement for Wearable Human Activity Recognition
Hanyu Liu
Siyao Li
Ying Yu
Yixuan Jiang
Hang Xiao
Jingxi Long
Haotian Tang
Chao Li
77
0
0
27 Mar 2025
BEAR: A Video Dataset For Fine-grained Behaviors Recognition Oriented with Action and Environment Factors
Chengyang Hu
Yuduo Chen
Lizhuang Ma
93
0
0
26 Mar 2025
STOP: Integrated Spatial-Temporal Dynamic Prompting for Video Understanding
Zichen Liu
Kunlun Xu
Fuchun Sun
Xu Zou
Yuxin Peng
Jiahuan Zhou
VLM, AI4TS
193
2
0
20 Mar 2025
Towards Scalable Modeling of Compressed Videos for Efficient Action Recognition
Shristi Das Biswas
Efstathia Soufleri
Arani Roy
Kaushik Roy
116
0
0
17 Mar 2025
Long-VMNet: Accelerating Long-Form Video Understanding via Fixed Memory
Saket Gurukar
Asim Kadav
VLM
144
0
0
17 Mar 2025
Gate-Shift-Pose: Enhancing Action Recognition in Sports with Skeleton Information
Edoardo Bianchi
Oswald Lanz
3DH
106
2
0
06 Mar 2025
BdSLW401: Transformer-Based Word-Level Bangla Sign Language Recognition Using Relative Quantization Encoding (RQE)
Husne Ara Rubaiyeat
Njayou Youssouf
Md. Kamrul Hasan
H. Mahmud
SLR
92
1
0
04 Mar 2025
Solar Multimodal Transformer: Intraday Solar Irradiance Predictor using Public Cameras and Time Series
Yanan Niu
Roy Sarkis
D. Psaltis
Mario Paolone
Christophe Moser
Luisa Lambertini
131
0
0
28 Feb 2025
Rethinking Multimodal Learning from the Perspective of Mitigating Classification Ability Disproportion
Qingyuan Jiang
Longfei Huang
Yang Yang
92
0
0
27 Feb 2025
MICINet: Multi-Level Inter-Class Confusing Information Removal for Reliable Multimodal Classification
Tianze Zhang
Shu Shen
Chao Chen
116
0
0
27 Feb 2025
ASurvey: Spatiotemporal Consistency in Video Generation
Zhiyu Yin
Kehai Chen
Xuefeng Bai
Ruili Jiang
Junlin Li
Hongdong Li
Jin Liu
Yang Xiang
Jun Yu
Min Zhang
EGVM, VGen, AI4TS
94
0
0
25 Feb 2025
Balanced Representation Learning for Long-tailed Skeleton-based Action Recognition
Hongda Liu
Yunlong Wang
Min Ren
Junxing Hu
Zhengquan Luo
Guangqi Hou
Zhe Sun
126
1
0
24 Feb 2025
Enhancing Video Understanding: Deep Neural Networks for Spatiotemporal Analysis
Amir Hosein Fadaei
M. Dehaqani
89
0
0
11 Feb 2025
MD-BERT: Action Recognition in Dark Videos via Dynamic Multi-Stream Fusion and Temporal Modeling
Sharana Dharshikgan Suresh Dass
H. Barua
Ganesh Krishnasamy
Raveendran Paramesran
Raphael C.-W. Phan
130
0
0
06 Feb 2025
BRIDLE: Generalized Self-supervised Learning with Quantization
Hoang M. Nguyen
Satya Narayan Shukla
Qiang Zhang
Hanchao Yu
Sreya D. Roy
Taipeng Tian
Lingjiong Zhu
Yuchen Liu
SSL, MQ
138
0
0
04 Feb 2025
Can masking background and object reduce static bias for zero-shot action recognition?
Takumi Fukuzawa
Kensho Hara
Hirokatsu Kataoka
Toru Tamaki
122
1
0
22 Jan 2025
High-Performance Inference Graph Convolutional Networks for Skeleton-Based Action Recognition
Ziao Li
Junyi Wang
Bangli Liu
Haibin Cai
Mohamad Saada
Guhong Nie
3DH
89
0
0
08 Jan 2025
Multiscaled Multi-Head Attention-based Video Transformer Network for Hand Gesture Recognition
Mallika Garg
Debashis Ghosh
P. M. Pradhan
SLR
113
16
0
03 Jan 2025
A Simple Recipe for Contrastively Pre-training Video-First Encoders Beyond 16 Frames
Pinelopi Papalampidi
Skanda Koppula
Shreya Pathak
Justin T Chiu
Joseph Heyward
Viorica Patraucean
Jiajun Shen
Antoine Miech
Andrew Zisserman
Aida Nematzadeh
VLM
134
26
0
31 Dec 2024
Scaling 4D Representations
João Carreira
Dilara Gokay
Michael King
Chuhan Zhang
Ignacio Rocco
...
Viorica Patraucean
Dima Damen
Pauline Luc
Mehdi S. M. Sajjadi
Andrew Zisserman
138
5
0
19 Dec 2024
CompactFlowNet: Efficient Real-time Optical Flow Estimation on Mobile Devices
Andrei Znobishchev
Valerii Filev
Oleg Kudashev
Nikita Orlov
Humphrey Shi
123
0
0
17 Dec 2024
Future Aspects in Human Action Recognition: Exploring Emerging Techniques and Ethical Influences
Antonios Gasteratos
Stavros N. Moutsis
Konstantinos A. Tsintotas
Yiannis Aloimonos
79
0
0
17 Dec 2024
EdgeOAR: Real-time Online Action Recognition On Edge Devices
Wei Luo
Deyu Zhang
Ying Tang
Fan Wu
Yaoxue Zhang
109
0
0
02 Dec 2024
Learning Visual Abstract Reasoning through Dual-Stream Networks
Kai Zhao
Chang Xu
Bailu Si
172
4
0
29 Nov 2024
A Novel Approach to Image Steganography Using Generative Adversarial Networks
Waheed Rehman
GAN
102
2
0
27 Nov 2024
A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning
Luis Vilaca
Yi Yu
Paula Vinan
186
0
0
24 Nov 2024
When Spatial meets Temporal in Action Recognition
H. Chen
Lei Wang
Yuxiao Chen
Tom Gedeon
Piotr Koniusz
166
3
0
22 Nov 2024